Novel plant tryptophan synthase beta subunit

ABSTRACT

This invention relates to an isolated nucleic acid fragment encoding a novel plant tryptophan synthase beta subunit. The invention also relates to the construction of a chimeric gene encoding all or a substantial portion of the novel plant tryptophan synthase beta subunit, in sense or antisense orientation, wherein expression of the chimeric gene results in production of altered levels of the novel plant tryptophan synthase beta subunit in a transformed host cell.

[0001] This application claims the benefit of U.S. Provisional Application No. 60/139,568, filed Jun. 16, 1999.

FIELD OF THE INVENTION

[0002] This invention is in the field of plant molecular biology. More specifically, this invention pertains to nucleic acid fragments encoding a novel tryptophan synthase beta subunit in plants and seeds.

BACKGROUND OF THE INVENTION

[0003] Many vertebrates, including man, lack the ability to manufacture a number of amino acids and therefore require these amino acids preformed in their diet. These are called essential amino acids. Plants are able to synthesize all twenty amino acids and serve as the ultimate source of the essential amino acids for humans and animals. Thus, the ability to manipulate the production and accumulation of the essential amino acids in plants is of considerable importance and value. Furthermore, the inability of animals to synthesize these amino acids provides a useful distinction between animal and plant cellular metabolism. This can be exploited for the discovery of herbicidal chemical compounds that target enzymes in the plant biosynthetic pathways of the essential amino acids and have low toxicity to animals.

[0004] In plants the tryptophan pathway leads to the biosynthesis of many secondary metabolites including the hormone indole-3-acetic acid, antimicrobial phytoalexins, alkaloids and glucosinolates. The two final reactions in tryptophan biosynthesis are catalyzed by tryptophan synthase. The 29 kDa alpha subunit is a bifunctional enzyme which cleaves indole-3-glycerol phosphate to produce indole and glyceraldehyde-3-phosphate. The beta subunit joins indole with serine to form tryptophan. Either subunit alone is enzymatically active, but the rate of the reaction and the affinity for the substrates increase when the subunits form a tetramer composed of two alpha subunits and two beta subunits (Radwanski (1995) Mol. Gen. Genet. 248:657-667).

[0005] Few of the genes encoding enzymes from the tryptophan pathway in corn, soybeans, rice and wheat have been isolated and sequenced. Corn genes encoding tryptophan synthase beta subunits have been identified (Wright et al. (1992) Plant Cell 4:711-719). The instant invention describes novel corn, rice, soybean, and wheat tryptophan synthase beta subunit homologs.

SUMMARY OF THE INVENTION

[0006] The present invention concerns an isolated polynucleotide comprising a nucleotide sequence selected from the group consisting of: (a) a first nucleotide sequence comprising a polynucleotide of at least 700 nucleotides from SEQ ID NO:1; (b) a second nucleotide sequence comprising a polynucleotide sequence of at least 420 nucleotides from the group consisting of SEQ ID NOs:3, 5, 10, and 12; (c) a third nucleotide sequence encoding a polypeptide of at least 100 amino acids having at least 80% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:2, 4, 6, 11, and 13; and (d) a fourth nucleotide sequence comprising a complement of the first or second nucleotide sequences.

[0007] In a second embodiment, this invention concerns an isolated polynucleotide comprising a nucleotide sequence of at least one of 60 (preferably at least one of 40, most preferably at least one of 30) contiguous nucleotides derived from a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 10, and 12 and the complement of such nucleotide sequences.

[0008] In a third embodiment, this invention relates to a chimeric gene comprising an isolated polynucleotide of the present invention operably linked to at least one suitable regulatory sequence.

[0009] In a fourth embodiment, the present invention concerns a host cell comprising a chimeric gene of the present invention or an isolated polynucleotide of the present invention. The host cell may be eukaryotic, such as a yeast or a plant cell, or prokaryotic, such as a bacterial cell. The present invention also relates to a virus, preferably a baculovirus, comprising an isolated polynucleotide of the present invention or a chimeric gene of the present invention.

[0010] In a fifth embodiment, the invention also relates to a process for producing a host cell comprising a chimeric gene of the present invention or an isolated polynucleotide of the present invention, the process comprising either transforming or transfecting a compatible host cell with a chimeric gene or isolated polynucleotide of the present invention.

[0011] In a sixth embodiment, the invention concerns a novel plant tryptophan synthase beta subunit polypeptide of at least 100 amino acids comprising at least 80% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:2, 4, 6, 11, and 13.

[0012] In a seventh embodiment, the invention relates to a method of selecting an isolated polynucleotide that affects the level of expression of a novel plant tryptophan synthase beta subunit polypeptide or enzyme activity in a host cell, preferably a plant cell, the method comprising the steps of: (a) constructing an isolated polynucleotide of the present invention or a chimeric gene of the present invention; (b) introducing the isolated polynucleotide or the chimeric gene into a host cell; (c) measuring the level of the novel plant tryptophan synthase beta subunit polypeptide or enzyme activity in the host cell containing the isolated polynucleotide; and (d) comparing the level of the novel plant tryptophan synthase beta subunit polypeptide or enzyme activity in the host cell containing the isolated polynucleotide with the level of the novel plant tryptophan synthase beta subunit polypeptide or enzyme activity in the host cell that does not contain the isolated polynucleotide.

[0013] In an eighth embodiment, the invention concerns a method of obtaining a nucleic acid fragment encoding a substantial portion of a novel plant tryptophan synthase beta subunit polypeptide, preferably a plant novel plant tryptophan synthase beta subunit polypeptide, comprising the steps of: synthesizing an oligonucleotide primer comprising a nucleotide sequence of at least one of 60 (preferably at least one of 40, most preferably at least one of 30) contiguous nucleotides derived from a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 10, and 12 and the complement of such nucleotide sequences; and amplifying a nucleic acid fragment (preferably a cDNA inserted in a cloning vector) using the oligonucleotide primer. The amplified nucleic acid fragment preferably will encode a substantial portion of a novel plant tryptophan synthase beta subunit amino acid sequence.

[0014] In a ninth embodiment, this invention relates to a method of obtaining a nucleic acid fragment encoding all or a substantial portion of the amino acid sequence encoding a novel plant tryptophan synthase beta subunit polypeptide comprising the steps of: probing a cDNA or genomic library with an isolated polynucleotide of the present invention; identifying a DNA clone that hybridizes with an isolated polynucleotide of the present invention; isolating the identified DNA clone; and sequencing the cDNA or genomic fragment that comprises the isolated DNA clone.

[0015] In a tenth embodiment, this invention concerns a composition, such as a hybridization mixture, comprising an isolated polynucleotide or an isolated polypeptide of the present invention.

[0016] In an eleventh embodiment, this invention concerns a method for positive selection of a transformed cell comprising: (a) transforming a host cell with the chimeric gene of the present invention or a construct of the present invention; and (b) growing the transformed host cell, preferably a plant cell, such as a monocot or a dicot, under conditions which allow expression of the novel plant tryptophan synthase beta subunit polynucleotide in an amount sufficient to complement a null mutant to provide a positive selection means.

[0017] In a twelfth embodiment, this invention relates to a method of altering the level of expression of a novel plant tryptophan synthase beta subunit in a host cell comprising: (a) transforming a host cell with a chimeric gene of the present invention; and (b) growing the transformed host cell under conditions that are suitable for expression of the chimeric gene wherein expression of the chimeric gene results in production of altered levels of the novel plant tryptophan synthase beta subunit in the transformed host cell.

[0018] A further embodiment of the instant invention is a method for evaluating at least one compound for its ability to inhibit the activity of a novel plant tryptophan synthase beta subunit, the method comprising the steps of: (a) transforming a host cell with a chimeric gene comprising a nucleic acid fragment encoding a novel plant tryptophan synthase beta subunit polypeptide, operably linked to at least one suitable regulatory sequence; (b) growing the transformed host cell under conditions that are suitable for expression of the chimeric gene wherein expression of the chimeric gene results in production of the tryptophan synthase beta subunit in the transformed host cell; (c) optionally purifying the novel plant tryptophan synthase beta subunit polypeptide expressed by the transformed host cell; (d) treating the novel plant tryptophan synthase beta subunit polypeptide with a compound to be tested; and (e) comparing the activity of the novel plant tryptophan synthase beta subunit polypeptide that has been treated with a test compound to the activity of an untreated novel plant tryptophan synthase beta subunit polypeptide, thereby selecting compounds with potential for inhibitory activity.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE LISTINGS

[0019] The invention can be more fully understood from the following detailed description, the accompanying drawings and Sequence Listing which form a part of this application.

[0020] FIG. 1 depicts the amino acid sequence alignment between the tryptophan synthase beta subunit from Aquifex aeolicus (NCBI General Identifier No. 293814; SEQ ID NO:7), the instant corn clone ccase-b.pk0015.e7 (SEQ ID NO:2), the instant soybean clone sfl1.pk131.b23 (SEQ ID NO:4), and the instant wheat clone wdk2c.pk005.o10:fis (SEQ ID NO:13). Amino acids which are identical among all sequences are indicated with an asterisk (*) above the alignment while those conserved only among the plant sequences are indicated by a plus sign (+) above the alignment. Dashes are used by the program to maximize alignment of the sequences.

[0021] FIG. 2 depicts the amino acid sequence alignment between the tryptophan synthase beta subunits from corn having NCBI General Identifier No. 168572 (SEQ ID NO:8) and NCBI General Identifier No. 168574 (SEQ ID NO:9) and the instant corn clone ccase-b.pk0015.e7 (SEQ ID NO:2). Amino acids which are identical among all sequences are indicated with an asterisk (*) above the alignment while those conserved only among the sequences in the public domain are indicated by a plus sign (+) above the alignment. Dashes are used by the program to maximize alignment of the sequences.

[0022] Table 1 lists the polypeptides that are described herein, the designation of the cDNA clones that comprise the nucleic acid fragments encoding polypeptides representing all or a substantial portion of these polypeptides, and the corresponding identifier (SEQ ID NO:) as used in the attached Sequence Listing. The sequence descriptions and Sequence Listing attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825.

TABLE 1
Novel Plant Tryptophan Synthase Beta Subunit

                                                                      SEQ ID NO:    SEQ ID NO:
Protein                                    Clone Designation          (Nucleotide)  (Amino Acid)
Corn Tryptophan Synthase Beta Subunit      ccase-b.pk0015.e7          1             2
Soybean Tryptophan Synthase Beta Subunit   sfl1.pk131.b23             3             4
Wheat Tryptophan Synthase Beta Subunit     wdk2c.pk005.o10            5             6
Rice Tryptophan Synthase Beta Subunit      Contig of:                 10            11
                                             rdr1f.pk003.m17
                                             rlr12.pk0007.g7
Wheat Tryptophan Synthase Beta Subunit     wdk2c.pk005.o10:fis        12            13

[0023] The Sequence Listing contains the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB standards described in Nucleic Acids Res. 13:3021-3030 (1985) and in the Biochemical J. 219 (No. 2):345-373 (1984) which are herein incorporated by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.

DETAILED DESCRIPTION OF THE INVENTION

[0024] In the context of this disclosure, a number of terms shall be utilized. The terms “polynucleotide”, “polynucleotide sequence”, “nucleic acid sequence”, and “nucleic acid fragment”/“isolated nucleic acid fragment” are used interchangeably herein. These terms encompass nucleotide sequences and the like. A polynucleotide may be a polymer of RNA or DNA that is single- or double-stranded, that optionally contains synthetic, non-natural or altered nucleotide bases. A polynucleotide in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA, synthetic DNA, or mixtures thereof. An isolated polynucleotide of the present invention may include at least one of 60 contiguous nucleotides, preferably at least one of 40 contiguous nucleotides, most preferably at least one of 30 contiguous nucleotides derived from SEQ ID NOs:1, 3, 5, 10, and 12, or the complement of such sequences.

[0025] The term “isolated polynucleotide” refers to a polynucleotide that is substantially free from other nucleic acid sequences, such as, and not limited to, other chromosomal and extrachromosomal DNA and RNA. Isolated polynucleotides may be purified from a host cell in which they naturally occur. Conventional nucleic acid purification methods known to skilled artisans may be used to obtain isolated polynucleotides. The term also embraces recombinant polynucleotides and chemically synthesized polynucleotides.

[0026] The term “recombinant” means, for example, that a nucleic acid sequence is made by an artificial combination of two otherwise separated segments of sequence, e.g., by chemical synthesis or by the manipulation of isolated nucleic acids by genetic engineering techniques.

[0027] As used herein, “contig” refers to a nucleotide sequence that is assembled from two or more constituent nucleotide sequences that share common or overlapping regions of sequence homology. For example, the nucleotide sequences of two or more nucleic acid fragments can be compared and aligned in order to identify common or overlapping sequences. Where common or overlapping sequences exist between two or more nucleic acid fragments, the sequences (and thus their corresponding nucleic acid fragments) can be assembled into a single contiguous nucleotide sequence.
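The assembly of overlapping fragments into a contig can be illustrated with a minimal sketch. It assumes exact, ungapped overlaps between the end of one sequence and the start of the next, which is simpler than real assembly software; the function name `merge_by_overlap`, the `min_overlap` parameter, and the example sequences are hypothetical and are not taken from the Sequence Listing.

```python
from typing import Optional


def merge_by_overlap(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Join two sequences when a suffix of `left` exactly matches a prefix of `right`.

    Returns the merged contiguous sequence, or None when no overlap of at
    least `min_overlap` bases exists. Illustrative only: no mismatches,
    gaps, or reverse-complement orientations are considered.
    """
    longest_possible = min(len(left), len(right))
    # Try the longest candidate overlap first so the merge is as compact as possible.
    for size in range(longest_possible, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


# Hypothetical fragments sharing the five-base overlap "GTTAC".
contig = merge_by_overlap("ATGGCGTTAC", "GTTACCGGA")
print(contig)
```

Trying the longest overlap first is a design choice for this sketch; a real assembler would also score inexact overlaps and consider both strands.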

[0028] As used herein, “substantially similar” refers to nucleic acid fragments wherein changes in one or more nucleotide bases result in substitution of one or more amino acids, but do not affect the functional properties of the polypeptide encoded by the nucleotide sequence. “Substantially similar” also refers to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate alteration of gene expression by gene silencing through, for example, antisense or co-suppression technology. “Substantially similar” also refers to modifications of the nucleic acid fragments of the instant invention such as deletion or insertion of one or more nucleotides that do not substantially affect the functional properties of the resulting transcript vis-à-vis the ability to mediate gene silencing or alteration of the functional properties of the resulting protein molecule. It is therefore understood that the invention encompasses more than the specific exemplary nucleotide or amino acid sequences and includes functional equivalents thereof. The terms “substantially similar” and “corresponding substantially” are used interchangeably herein.

[0029] Substantially similar nucleic acid fragments may be selected by screening nucleic acid fragments representing subfragments or modifications of the nucleic acid fragments of the instant invention, wherein one or more nucleotides are substituted, deleted and/or inserted, for their ability to affect the level of the polypeptide encoded by the unmodified nucleic acid fragment in a plant or plant cell. For example, a substantially similar nucleic acid fragment representing at least one of 30 contiguous nucleotides derived from the instant nucleic acid fragment can be constructed and introduced into a plant or plant cell. The level of the polypeptide encoded by the unmodified nucleic acid fragment present in a plant or plant cell exposed to the substantially similar nucleic acid fragment can then be compared to the level of the polypeptide in a plant or plant cell that is not exposed to the substantially similar nucleic acid fragment.

[0030] For example, it is well known in the art that antisense suppression and co-suppression of gene expression may be accomplished using nucleic acid fragments representing less than the entire coding region of a gene, and by using nucleic acid fragments that do not share 100% sequence identity with the gene to be suppressed. Moreover, alterations in a nucleic acid fragment which result in the production of a chemically equivalent amino acid at a given site, but do not affect the functional properties of the encoded polypeptide, are well known in the art. Thus, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, can also be expected to produce a functionally equivalent product. Nucleotide changes which result in alteration of the N-terminal and C-terminal portions of the polypeptide molecule would also not be expected to alter the activity of the polypeptide. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. Consequently, an isolated polynucleotide comprising a nucleotide sequence of at least one of 60 (preferably at least one of 40, most preferably at least one of 30) contiguous nucleotides derived from a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 10, and 12, and the complement of such nucleotide sequences may be used in methods of selecting an isolated polynucleotide that affects the expression of a novel plant tryptophan synthase beta subunit polypeptide in a host cell. A method of selecting an isolated polynucleotide that affects the level of expression of a polypeptide in a virus or in a host cell (eukaryotic, such as plant or yeast, prokaryotic such as bacterial) may comprise the steps of: constructing an isolated polynucleotide of the present invention or a chimeric gene of the present invention; introducing the isolated polynucleotide or the chimeric gene into a host cell; measuring the level of a polypeptide or enzyme activity in the host cell containing the isolated polynucleotide; and comparing the level of a polypeptide or enzyme activity in the host cell containing the isolated polynucleotide with the level of a polypeptide or enzyme activity in a host cell that does not contain the isolated polynucleotide.

[0031] Moreover, substantially similar nucleic acid fragments may also be characterized by their ability to hybridize. Estimates of such homology are provided by either DNA-DNA or DNA-RNA hybridization under conditions of stringency as is well understood by those skilled in the art (Hames and Higgins, Eds. (1985) Nucleic Acid Hybridisation, IRL Press, Oxford, U.K.). Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Post-hybridization washes determine stringency conditions. One set of preferred conditions uses a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2×SSC, 0.5% SDS at 45° C. for 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at 50° C. for 30 min. A more preferred set of stringent conditions uses higher temperatures in which the washes are identical to those above except for the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS, which is increased to 60° C. Another preferred set of highly stringent conditions uses two final washes in 0.1×SSC, 0.1% SDS at 65° C.
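The three wash regimes above can be written down as data, which makes the differences between them easy to compare. The following is only a sketch: the names `Wash`, `INITIAL_WASHES`, `FINAL_WASHES`, and `wash_series` are hypothetical, and two points are assumptions rather than statements from the text, namely that the highly stringent set keeps the same two initial washes and that its final-wash duration, which is not stated, is left unspecified (`None`).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Wash:
    ssc: float                 # SSC concentration, e.g. 0.2 means 0.2x SSC
    sds_percent: float         # SDS concentration in percent
    temp_c: Optional[float]    # None means room temperature
    minutes: Optional[int]     # None means the duration is not stated in the text
    repeats: int = 1


# Initial washes shared by the first two regimes (assumed shared by the third).
INITIAL_WASHES = [
    Wash(ssc=6.0, sds_percent=0.5, temp_c=None, minutes=15),
    Wash(ssc=2.0, sds_percent=0.5, temp_c=45.0, minutes=30),
]

# Final washes are what distinguish the stringency levels.
FINAL_WASHES = {
    "preferred": Wash(0.2, 0.5, 50.0, 30, repeats=2),
    "more_preferred": Wash(0.2, 0.5, 60.0, 30, repeats=2),
    "highly_stringent": Wash(0.1, 0.1, 65.0, None, repeats=2),
}


def wash_series(level: str) -> List[Wash]:
    """Return the ordered wash steps for the named stringency level."""
    return INITIAL_WASHES + [FINAL_WASHES[level]]


for step in wash_series("more_preferred"):
    print(step)
```

Lower salt and higher temperature in the final wash both raise stringency, which is visible in the table as a falling `ssc` value and a rising `temp_c` value.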

[0032] Substantially similar nucleic acid fragments of the instant invention may also be characterized by the percent identity of the amino acid sequences that they encode to the amino acid sequences disclosed herein, as determined by algorithms commonly employed by those skilled in this art. Suitable nucleic acid fragments (isolated polynucleotides of the present invention) encode polypeptides that are at least about 70% identical, preferably at least about 80% identical to the amino acid sequences reported herein. Preferred nucleic acid fragments encode amino acid sequences that are about 85% identical to the amino acid sequences reported herein. More preferred nucleic acid fragments encode amino acid sequences that are at least about 90% identical to the amino acid sequences reported herein. Most preferred are nucleic acid fragments that encode amino acid sequences that are at least about 95% identical to the amino acid sequences reported herein. Suitable nucleic acid fragments not only have the above identities but typically encode a polypeptide having at least 50 amino acids, preferably at least 100 amino acids, more preferably at least 150 amino acids, still more preferably at least 200 amino acids, and most preferably at least 250 amino acids. Sequence alignments and percent identity calculations were performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5.
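To make the notion of percent identity concrete, the sketch below computes it for two sequences that have already been aligned, with dashes marking gaps. It does not reproduce the Megalign or Clustal calculation; it assumes one common convention, in which columns containing a gap in either sequence are excluded from the comparison. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the gap-free columns of a pairwise alignment.

    Assumption: columns where either sequence has a gap are skipped.
    Other tools may normalize differently (for example, by alignment length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned peptides: 5 gap-free columns, 4 of them identical.
print(percent_identity("MKT-LV", "MRTALV"))
```

Under this convention a threshold such as "at least 80% identity" would be checked by comparing the returned value with 80.0; the result depends on the alignment supplied, so the alignment method and its parameters matter as much as the formula.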

[0033] A “substantial portion” of an amino acid or nucleotide sequence comprises an amino acid or a nucleotide sequence that is sufficient to afford putative identification of the protein or gene that the amino acid or nucleotide sequence comprises. Amino acid and nucleotide sequences can be evaluated either manually by one skilled in the art, or by using computer-based sequence comparison and identification tools that employ algorithms such as BLAST (Basic Local Alignment Search Tool; Altschul et al. (1990) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or more contiguous amino acids or thirty or more contiguous nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene-specific oligonucleotide probes comprising 30 or more contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or bacteriophage plaques). In addition, short oligonucleotides of 12 or more nucleotides may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers. Accordingly, a “substantial portion” of a nucleotide sequence comprises a nucleotide sequence that will afford specific identification and/or isolation of a nucleic acid fragment comprising the sequence. The instant specification teaches amino acid and nucleotide sequences encoding polypeptides that comprise one or more particular plant proteins. The skilled artisan, having the benefit of the sequences as reported herein, may now use all or a substantial portion of the disclosed sequences for purposes known to those skilled in this art. Accordingly, the instant invention comprises the complete sequences as reported in the accompanying Sequence Listing, as well as substantial portions of those sequences as defined above.

[0034] “Codon degeneracy” refers to divergence in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment comprising a nucleotide sequence that encodes all or a substantial portion of the amino acid sequences set forth herein. The skilled artisan is well aware of the “codon-bias” exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a nucleic acid fragment for improved expression in a host cell, it is desirable to design the nucleic acid fragment such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.

[0035] “Synthetic nucleic acid fragments” can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form larger nucleic acid fragments which may then be enzymatically assembled to construct the entire desired nucleic acid fragment. “Chemically synthesized”, as related to a nucleic acid fragment, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of nucleic acid fragments may be accomplished using well established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. Accordingly, the nucleic acid fragments can be tailored for optimal gene expression based on optimization of the nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.
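The survey of host genes described above can be sketched as a simple codon count followed by back-translation with the most frequent codon for each amino acid. This is a minimal illustration under stated assumptions: it uses the standard genetic code, treats "most frequent codon" as the only design criterion, and uses a single made-up host coding sequence. The names `preferred_codons` and `back_translate` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def preferred_codons(coding_sequences: Iterable[str]) -> Dict[str, str]:
    """Map each amino acid (and '*' for stop) to its most frequent codon in the survey."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper()
        # Walk the sequence in frame, ignoring any trailing partial codon.
        for pos in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[pos:pos + 3]
            if codon in CODON_TABLE:
                counts[CODON_TABLE[codon]][codon] += 1
    return {aa: usage.most_common(1)[0][0] for aa, usage in counts.items()}


def back_translate(protein: str, preferred: Dict[str, str]) -> str:
    """Encode a protein using the preferred codon for each residue."""
    return "".join(preferred[aa] for aa in protein)


# Hypothetical host coding sequence: M A A A K K K stop.
host_genes = ["ATGGCCGCCGCTAAGAAGAAATAA"]
table = preferred_codons(host_genes)
print(back_translate("MAK", table))
```

`back_translate` raises `KeyError` for any residue absent from the survey, which is deliberate in this sketch: it signals that the survey is too small to support a design rather than silently guessing a codon.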

[0036] “Gene” refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in nature with its own regulatory sequences. “Chimeric gene” refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign gene” refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A “transgene” is a gene that has been introduced into the genome by a transformation procedure.

[0037] “Coding sequence” refers to a nucleotide sequence that codes for a specific amino acid sequence. “Regulatory sequences” refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, and polyadenylation recognition sequences.

[0038] “Promoter” refers to a nucleotide sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3′ to a promoter sequence. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an “enhancer” is a nucleotide sequence which can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Promoters may be derived in their entirety from a native gene, or may be composed of different elements derived from different promoters found in nature, or may even comprise synthetic nucleotide segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. Promoters which cause a nucleic acid fragment to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. New promoters of various types useful in plant cells are constantly being discovered; numerous examples may be found in the compilation by Okamuro and Goldberg (1989) Biochemistry of Plants 15:1-82. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, nucleic acid fragments of different lengths may have identical promoter activity.

[0039] “Translation leader sequence” refers to a nucleotide sequence located between the promoter sequence of a gene and the coding sequence. The translation leader sequence is present in the fully processed mRNA upstream of the translation start sequence. The translation leader sequence may affect processing of the primary transcript to mRNA, mRNA stability or translation efficiency. Examples of translation leader sequences have been described (Turner and Foster (1995) Mol. Biotechnol. 3:225-236).

[0040] “3′ Non-coding sequences” refers to nucleotide sequences located downstream of a coding sequence and includes polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3′ end of the mRNA precursor. The use of different 3′ non-coding sequences is exemplified by Ingelbrecht et al. (1989) Plant Cell 1:671-680.

[0041] “RNA transcript” refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript or it may be an RNA sequence derived from posttranscriptional processing of the primary transcript and is referred to as the mature RNA. “Messenger RNA (mRNA)” refers to the RNA that is without introns and can be translated into polypeptides by the cell. “cDNA” refers to DNA that is complementary to and derived from an mRNA template. The cDNA can be single-stranded or converted to double stranded form using, for example, the Klenow fragment of DNA polymerase I. “Sense RNA” refers to an RNA transcript that includes the mRNA and can be translated into a polypeptide by the cell. “Antisense RNA” refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene (see U.S. Pat. No. 5,107,065, incorporated herein by reference). The complementarity of an antisense RNA may be with any part of the specific nucleotide sequence, i.e., at the 5′ non-coding sequence, 3′ non-coding sequence, introns, or the coding sequence. “Functional RNA” refers to sense RNA, antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has an effect on cellular processes.
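The complementarity between sense and antisense RNA defined above can be shown with a short sketch that derives both transcripts from a sense-strand DNA sequence. It covers only base pairing (A with T or U, G with C) and is not a model of how antisense suppression works in a cell; the function names and the six-base example are hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary DNA strand, written 5' to 3'."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def sense_rna(sense_dna: str) -> str:
    """RNA with the same sequence as the sense DNA strand (T replaced by U)."""
    return sense_dna.upper().replace("T", "U")


def antisense_rna(sense_dna: str) -> str:
    """RNA complementary to the sense transcript, written 5' to 3'."""
    return reverse_complement(sense_dna.upper()).replace("T", "U")


# Hypothetical six-base sense-strand sequence.
print(sense_rna("ATGGCC"), antisense_rna("ATGGCC"))
```

Both sequences are written 5′ to 3′, so the antisense transcript reads as the reverse complement of the sense transcript rather than as a position-by-position complement.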

[0042] The term “operably linked” refers to the association of two ormore nucleic acid fragments so that the function of one is affected bythe other. For example, a promoter is operably linked with a codingsequence when it is capable of affecting the expression of that codingsequence (i.e., that the coding sequence is under the transcriptionalcontrol of the promoter). Coding sequences can be operably linked toregulatory sequences in sense or antisense orientation.

[0043] The term “expression”, as used herein, refers to thetranscription and stable accumulation of sense (mRNA) or antisense RNAderived from the nucleic acid fragment of the invention. “Expression”may also refer to translation of mRNA into a polypeptide. “Antisenseinhibition” refers to the production of antisense RNA transcriptscapable of suppressing the expression of the target protein.“Overexpression” refers to the production of a gene product intransgenic organisms that exceeds levels of production in normal ornon-transformed organisms. “Co-suppression” refers to the production ofsense RNA transcripts capable of suppressing the expression of identicalor substantially similar foreign or endogenous genes (U.S. Pat. No.5,231,020, incorporated herein by reference).

[0044] A “protein” or “polypeptide” is a chain of amino acids arranged in a specific order determined by the coding sequence in a polynucleotide encoding the polypeptide. Each protein or polypeptide has a unique function.

[0045] “Altered levels” or “altered expression” refer to the production of gene product(s) in transgenic organisms in amounts or proportions that differ from those of normal or non-transformed organisms.

[0046] “Null mutant” refers to a host cell which either lacks the expression of a certain polypeptide or expresses a polypeptide which is inactive or does not have any detectable expected enzymatic function.

[0047] “Mature protein” or the term “mature” when used in describing aprotein refers to a post-translationally processed polypeptide; i.e.,one from which any pre- or propeptides present in the primarytranslation product have been removed. “Precursor protein” or the term“precursor” when used in describing a protein refers to the primaryproduct of translation of mRNA; i.e., with pre- and propeptides stillpresent. Pre- and propeptides may be but are not limited tointracellular localization signals.

[0048] A “chloroplast transit peptide” is an amino acid sequence whichis translated in conjunction with a protein and directs the protein tothe chloroplast or other plastid types present in the cell in which theprotein is made. “Chloroplast transit sequence” refers to a nucleotidesequence that encodes a chloroplast transit peptide. A “signal peptide”is an amino acid sequence which is translated in conjunction with aprotein and directs the protein to the secretory system (Chrispeels(1991) Ann. Rev. Plant Phys. Plant Mol. Biol. 42:21-53). If the proteinis to be directed to a vacuole, a vacuolar targeting signal (supra) canfurther be added, or if to the endoplasmic reticulum, an endoplasmicreticulum retention signal (supra) may be added. If the protein is to bedirected to the nucleus, any signal peptide present should be removedand instead a nuclear localization signal included (Raikhel (1992) PlantPhys. 100:1627-1632).

[0049] “Transformation” refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” organisms. Examples of methods of plant transformation include Agrobacterium-mediated transformation (De Blaere et al. (1987) Meth. Enzymol. 143:277) and particle-accelerated or “gene gun” transformation technology (Klein et al. (1987) Nature (London) 327:70-73; U.S. Pat. No. 4,945,050, incorporated herein by reference). Thus, isolated polynucleotides of the present invention can be incorporated into recombinant constructs, typically DNA constructs, capable of introduction into and replication in a host cell. Such a construct can be a vector that includes a replication system and sequences that are capable of transcription and translation of a polypeptide-encoding sequence in a given host cell. A number of vectors suitable for stable transfection of plant cells or for the establishment of transgenic plants have been described in, e.g., Pouwels et al., Cloning Vectors: A Laboratory Manual, 1985, supp. 1987; Weissbach and Weissbach, Methods for Plant Molecular Biology, Academic Press, 1989; and Gelvin et al., Plant Molecular Biology Manual, Kluwer Academic Publishers, 1990. Typically, plant expression vectors include, for example, one or more cloned plant genes under the transcriptional control of 5′ and 3′ regulatory sequences and a dominant selectable marker. Such plant expression vectors also can contain a promoter regulatory region (e.g., a regulatory region controlling inducible or constitutive, environmentally- or developmentally-regulated, or cell- or tissue-specific expression), a transcription initiation start site, a ribosome binding site, an RNA processing signal, a transcription termination site, and/or a polyadenylation signal.

[0050] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described more fully in Sambrook et al., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (hereinafter “Maniatis”).

[0051] “PCR” or “polymerase chain reaction” is well known by those skilled in the art as a technique used for the amplification of specific DNA segments (U.S. Pat. Nos. 4,683,195 and 4,800,159).

[0052] The present invention concerns an isolated polynucleotide comprising a nucleotide sequence selected from the group consisting of: (a) a first nucleotide sequence comprising a polynucleotide of at least 700 nucleotides from SEQ ID NO:1; (b) a second nucleotide sequence comprising a polynucleotide of at least 420 nucleotides from the group consisting of SEQ ID NOs:3, 5, 10, and 12; (c) a third nucleotide sequence encoding a polypeptide of at least 100 amino acids having at least 80% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:2, 4, 6, 11, and 13; and (d) a fourth nucleotide sequence comprising the complement of the first or second nucleotide sequences.

[0053] Nucleic acid fragments encoding at least a substantial portion ofseveral novel plant tryptophan synthase beta subunits have been isolatedand identified by comparison of random plant cDNA sequences to publicdatabases containing nucleotide and protein sequences using the BLASTalgorithms well known to those skilled in the art. The nucleic acidfragments of the instant invention may be used to isolate cDNAs andgenes encoding homologous proteins from the same or other plant species.Isolation of homologous genes using sequence-dependent protocols is wellknown in the art. Examples of sequence-dependent protocols include, butare not limited to, methods of nucleic acid hybridization, and methodsof DNA and RNA amplification as exemplified by various uses of nucleicacid amplification technologies (e.g., polymerase chain reaction, ligasechain reaction).

[0054] For example, genes encoding other novel plant tryptophan synthasebeta subunits, either as cDNAs or genomic DNAs, could be isolateddirectly by using all or a substantial portion of the instant nucleicacid fragments as DNA hybridization probes to screen libraries from anydesired plant employing methodology well known to those skilled in theart. Specific oligonucleotide probes based upon the instant nucleic acidsequences can be designed and synthesized by methods known in the art(Maniatis). Moreover, an entire sequence(s) can be used directly tosynthesize DNA probes by methods known to the skilled artisan such asrandom primer DNA labeling, nick translation, end-labeling techniques,or RNA probes using available in vitro transcription systems. Inaddition, specific primers can be designed and used to amplify a part orall of the instant sequences. The resulting amplification products canbe labeled directly during amplification reactions or labeled afteramplification reactions, and used as probes to isolate full length cDNAor genomic fragments under conditions of appropriate stringency.

[0055] In addition, two short segments of the instant nucleic acid fragments may be used in polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. The polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts at the 3′ end of the mRNA precursor encoding plant genes. Alternatively, the second primer sequence may be based upon sequences derived from the cloning vector. For example, the skilled artisan can follow the RACE protocol (Frohman et al. (1988) Proc. Natl. Acad. Sci. USA 85:8998-9002) to generate cDNAs by using PCR to amplify copies of the region between a single point in the transcript and the 3′ or 5′ end. Primers oriented in the 3′ and 5′ directions can be designed from the instant sequences. Using commercially available 3′ RACE or 5′ RACE systems (BRL), specific 3′ or 5′ cDNA fragments can be isolated (Ohara et al. (1989) Proc. Natl. Acad. Sci. USA 86:5673-5677; Loh et al. (1989) Science 243:217-220). Products generated by the 3′ and 5′ RACE procedures can be combined to generate full-length cDNAs (Frohman and Martin (1989) Techniques 1:165). Consequently, a polynucleotide comprising a nucleotide sequence of at least 60 (preferably at least 40, most preferably at least 30) contiguous nucleotides derived from a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 10, and 12 and the complement of such nucleotide sequences may be used in such methods to obtain a nucleic acid fragment encoding a substantial portion of an amino acid sequence of a polypeptide.
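By way of illustration only, the enumeration of candidate contiguous segments from a sequence and from its complement may be sketched as follows. The function names and the 40-nucleotide toy sequence are hypothetical and are not taken from any SEQ ID NO; practical primer design would also weigh melting temperature, secondary structure and specificity, none of which is modeled here.

```python
def complement_strand(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def contiguous_segments(seq: str, length: int = 30) -> list[str]:
    """List every window of `length` contiguous nucleotides in `seq`."""
    seq = seq.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Hypothetical 40-nt stand-in sequence (an assumption for the example only).
toy = "ATGGCTTCAGGTACCGATCCGTTAGCAAGCTTGGACTCAA"

# Candidate 30-nt segments from the sequence and from its complement.
forward = contiguous_segments(toy, 30)
reverse = contiguous_segments(complement_strand(toy), 30)
print(len(forward), len(reverse))
```

A sequence shorter than the requested length simply yields no candidates.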

[0056] The present invention relates to a method of obtaining a nucleic acid fragment encoding a substantial portion of a novel plant tryptophan synthase beta subunit polypeptide, comprising the steps of: synthesizing an oligonucleotide primer comprising a nucleotide sequence of at least 60 (preferably at least 40, most preferably at least 30) contiguous nucleotides derived from a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 10, and 12, and the complement of such nucleotide sequences; and amplifying a nucleic acid fragment (preferably a cDNA inserted in a cloning vector) using the oligonucleotide primer. The amplified nucleic acid fragment preferably will encode a substantial portion of a novel plant tryptophan synthase beta subunit polypeptide.

[0057] Availability of the instant nucleotide and deduced amino acid sequences facilitates immunological screening of cDNA expression libraries. Synthetic peptides representing substantial portions of the instant amino acid sequences may be synthesized. These peptides can be used to immunize animals to produce polyclonal or monoclonal antibodies with specificity for peptides or proteins comprising the amino acid sequences. These antibodies can then be used to screen cDNA expression libraries to isolate full-length cDNA clones of interest (Lerner (1984) Adv. Immunol. 36:1-34; Maniatis).

[0058] In another embodiment, this invention concerns viruses and hostcells comprising either the chimeric genes of the invention as describedherein or an isolated polynucleotide of the invention as describedherein. Examples of host cells which can be used to practice theinvention include, but are not limited to, yeast, bacteria, and plants.

[0059] As was noted above, the nucleic acid fragments of the instantinvention may be used to create transgenic plants in which the disclosedpolypeptides are present at higher or lower levels than normal or incell types or developmental stages in which they are not normally found.This would have the effect of altering the level of tryptophan in thosecells. The tryptophan synthase beta subunit sequences included in thisapplication are similar to bacterial tryptophan synthase beta subunitgenes but distantly related to the previously known plant tryptophansynthase beta subunit genes. Inactivation by mutation of the twopreviously known corn tryptophan synthase beta subunit genes causes cornto require tryptophan for growth. Inactivation by mutation of one of thepreviously known Arabidopsis tryptophan synthase beta subunit genescauses tryptophan auxotrophy suggesting the presence of other tryptophansynthase beta subunits in plants. Inhibition of this enzyme may lead tocell death because the substrate of the enzyme, indole, is toxic.Manipulation of the levels of this tryptophan synthase beta subunit maylead to the production of higher levels of tryptophan.

[0060] Overexpression of the proteins of the instant invention may beaccomplished by first constructing a chimeric gene in which the codingregion is operably linked to a promoter capable of directing expressionof a gene in the desired tissues at the desired stage of development.The chimeric gene may comprise promoter sequences and translation leadersequences derived from the same genes. 3′ Non-coding sequences encodingtranscription termination signals may also be provided. The instantchimeric gene may also comprise one or more introns in order tofacilitate gene expression.

[0061] Plasmid vectors comprising the instant isolated polynucleotide(or chimeric gene) may be constructed. The choice of plasmid vector isdependent upon the method that will be used to transform host plants.The skilled artisan is well aware of the genetic elements that must bepresent on the plasmid vector in order to successfully transform, selectand propagate host cells containing the chimeric gene. The skilledartisan will also recognize that different independent transformationevents will result in different levels and patterns of expression (Joneset al. (1985) EMBO J. 4:2411-2418; De Almeida et al. (1989) Mol. Gen.Genetics 218:78-86), and thus that multiple events must be screened inorder to obtain lines displaying the desired expression level andpattern. Such screening may be accomplished by Southern analysis of DNA,Northern analysis of mRNA expression, Western analysis of proteinexpression, or phenotypic analysis.

[0062] For some applications it may be useful to direct the instantpolypeptides to different cellular compartments, or to facilitate theirsecretion from the cell. It is thus envisioned that the chimeric genedescribed above may be further supplemented by directing the codingsequence to encode the instant polypeptides with appropriateintracellular targeting sequences such as transit sequences (Keegstra(1989) Cell 56:247-253), signal sequences or sequences encodingendoplasmic reticulum localization (Chrispeels (1991) Ann. Rev. PlantPhys. Plant Mol. Biol. 42:21-53), or nuclear localization signals(Raikhel (1992) Plant Phys. 100:1627-1632) with or without removingtargeting sequences that are already present. While the references citedgive examples of each of these, the list is not exhaustive and moretargeting signals of use may be discovered in the future.

[0063] In another embodiment, the present invention concerns a polypeptide of at least 100 amino acids that has at least 80% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:2, 4, 6, 11, and 13.

[0064] The instant polypeptides (or substantial portions thereof) may beproduced in heterologous host cells, particularly in the cells ofmicrobial hosts, and can be used to prepare antibodies to these proteinsby methods well known to those skilled in the art. The antibodies areuseful for detecting the polypeptides of the instant invention in situin cells or in vitro in cell extracts. Preferred heterologous host cellsfor production of the instant polypeptides are microbial hosts.Microbial expression systems and expression vectors containingregulatory sequences that direct high level expression of foreignproteins are well known to those skilled in the art. Any of these couldbe used to construct a chimeric gene for production of the instantpolypeptides. This chimeric gene could then be introduced intoappropriate microorganisms via transformation to provide high levelexpression of the encoded novel plant tryptophan synthase beta subunit.An example of a vector for high level expression of the instantpolypeptides in a bacterial host is provided (Example 6).

[0065] Additionally, the instant polypeptides can be used as targets to facilitate design and/or identification of inhibitors of those enzymes that may be useful as herbicides. This is desirable because the tryptophan synthase beta subunit described herein catalyzes the last step in tryptophan synthesis from chorismic acid. Accordingly, inhibition of the activity of the enzyme described herein could lead to inhibition of plant growth. Thus, the instant tryptophan synthase beta subunit could be appropriate for new herbicide discovery and design.

[0066] All or a substantial portion of the polynucleotides of theinstant invention may also be used as probes for genetically andphysically mapping the genes that they are a part of, and used asmarkers for traits linked to those genes. Such information may be usefulin plant breeding in order to develop lines with desired phenotypes. Forexample, the instant nucleic acid fragments may be used as restrictionfragment length polymorphism (RFLP) markers. Southern blots (Maniatis)of restriction-digested plant genomic DNA may be probed with the nucleicacid fragments of the instant invention. The resulting banding patternsmay then be subjected to genetic analyses using computer programs suchas MapMaker (Lander et al. (1987) Genomics 1:174-181) in order toconstruct a genetic map. In addition, the nucleic acid fragments of theinstant invention may be used to probe Southern blots containingrestriction endonuclease-treated genomic DNAs of a set of individualsrepresenting parent and progeny of a defined genetic cross. Segregationof the DNA polymorphisms is noted and used to calculate the position ofthe instant nucleic acid sequence in the genetic map previously obtainedusing this population (Botstein et al. (1980) Am. J. Hum. Genet.32:314-331).

[0067] The production and use of plant gene-derived probes for use ingenetic mapping is described in Bernatzky and Tanksley (1986) Plant Mol.Biol. Reporter 4:37-41. Numerous publications describe genetic mappingof specific cDNA clones using the methodology outlined above orvariations thereof. For example, F2 intercross populations, backcrosspopulations, randomly mated populations, near isogenic lines, and othersets of individuals may be used for mapping. Such methodologies are wellknown to those skilled in the art.

[0068] Nucleic acid probes derived from the instant nucleic acidsequences may also be used for physical mapping (i.e., placement ofsequences on physical maps; see Hoheisel et al., In: NonmammalianGenomic Analysis: A Practical Guide, Academic press 1996, pp. 319-346,and references cited therein).

[0069] In another embodiment, nucleic acid probes derived from theinstant nucleic acid sequences may be used in direct fluorescence insitu hybridization (FISH) mapping (Trask (1991) Trends Genet.7:149-154). Although current methods of FISH mapping favor use of largeclones (several to several hundred KB; see Laan et al. (1995) GenomeRes. 5:13-20), improvements in sensitivity may allow performance of FISHmapping using shorter probes.

[0070] A variety of nucleic acid amplification-based methods of geneticand physical mapping may be carried out using the instant nucleic acidsequences. Examples include allele-specific amplification (Kazazian(1989) J. Lab. Clin. Med. 11:95-96), polymorphism of PCR-amplifiedfragments (CAPS; Sheffield et al. (1993) Genomics 16:325-332),allele-specific ligation (Landegren et al. (1988) Science241:1077-1080), nucleotide extension reactions (Sokolov (1990) NucleicAcid Res. 18:3671), Radiation Hybrid Mapping (Walter et al. (1997) Nat.Genet. 7:22-28) and Happy Mapping (Dear and Cook (1989) Nucleic AcidRes. 17:6795-6807). For these methods, the sequence of a nucleic acidfragment is used to design and produce primer pairs for use in theamplification reaction or in primer extension reactions. The design ofsuch primers is well known to those skilled in the art. In methodsemploying PCR-based genetic mapping, it may be necessary to identify DNAsequence differences between the parents of the mapping cross in theregion corresponding to the instant nucleic acid sequence. This,however, is generally not necessary for mapping methods.

EXAMPLES

[0071] The present invention is further defined in the followingExamples, in which parts and percentages are by weight and degrees areCelsius, unless otherwise stated. It should be understood that theseExamples, while indicating preferred embodiments of the invention, aregiven by way of illustration only. From the above discussion and theseExamples, one skilled in the art can ascertain the essentialcharacteristics of this invention, and without departing from the spiritand scope thereof, can make various changes and modifications of theinvention to adapt it to various usages and conditions. Thus, variousmodifications of the invention in addition to those shown and describedherein will be apparent to those skilled in the art from the foregoingdescription. Such modifications are also intended to fall within thescope of the appended claims.

[0072] The disclosure of each reference set forth herein is incorporated herein by reference in its entirety.

EXAMPLE 1 Composition of cDNA Libraries; Isolation and Sequencing of cDNA Clones

[0073] cDNA libraries representing mRNAs from various corn, rice, soybean, and wheat tissues were prepared. The characteristics of the libraries are described below.

TABLE 2
cDNA Libraries from Corn, Rice, Soybean, and Wheat

Library  Tissue                                          Clone
ccase-b  Corn Callus Type II Tissue, Somatic Embryo      ccase-b.pk0015.e7
         Formed, Highly Transformable
sfl1     Soybean Immature Flower                         sfl1.pk131.b23
rlr12    Rice Leaf 15 Days After Germination, 12 Hours   rlr12.pk0007.g7
         After Infection of Strain Magnaporthe grisea
         4360-R-62 (AVR2-YAMO); Resistant
rdr1f    Rice Developing Root of 10 Day Old Plant        rdr1f.pk003.m17
wdk2c    Wheat Developing Kernel, 7 Days After Anthesis  wdk2c.pk005.o10

[0074] cDNA libraries may be prepared by any one of many methodsavailable. For example, the cDNAs may be introduced into plasmid vectorsby first preparing the cDNA libraries in Uni-ZAP™ XR vectors accordingto the manufacturer's protocol (Stratagene Cloning Systems, La Jolla,Calif.). The Uni-ZAP™ XR libraries are converted into plasmid librariesaccording to the protocol provided by Stratagene. Upon conversion, cDNAinserts will be contained in the plasmid vector pBluescript. Inaddition, the cDNAs may be introduced directly into precut Bluescript IISK(+) vectors (Stratagene) using T4 DNA ligase (New England Biolabs),followed by transfection into DH10B cells according to themanufacturer's protocol (GIBCO BRL Products). Once the cDNA inserts arein plasmid vectors, plasmid DNAs are prepared from randomly pickedbacterial colonies containing recombinant pBluescript plasmids, or theinsert cDNA sequences are amplified via polymerase chain reaction usingprimers specific for vector sequences flanking the inserted cDNAsequences. Amplified insert DNAs or plasmid DNAs are sequenced indye-primer sequencing reactions to generate partial cDNA sequences(expressed sequence tags or “ESTs”; see Adams et al., (1991) Science252:1651-1656). The resulting ESTs are analyzed using a Perkin ElmerModel 377 fluorescent sequencer.

EXAMPLE 2 Identification of cDNA Clones

[0075] cDNA clones encoding novel plant tryptophan synthase beta subunits were identified by conducting BLAST (Basic Local Alignment Search Tool; Altschul et al. (1993) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences contained in the BLAST “nr” database (comprising all non-redundant GenBank CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein Data Bank, the last major release of the SWISS-PROT protein sequence database, EMBL, and DDBJ databases). The cDNA sequences obtained in Example 1 were analyzed for similarity to all publicly available DNA sequences contained in the “nr” database using the BLASTN algorithm provided by the National Center for Biotechnology Information (NCBI). The DNA sequences were translated in all reading frames and compared for similarity to all publicly available protein sequences contained in the “nr” database using the BLASTX algorithm (Gish and States (1993) Nat. Genet. 3:266-272) provided by the NCBI. For convenience, the P-value (probability) of observing a match of a cDNA sequence to a sequence contained in the searched databases merely by chance, as calculated by BLAST, is reported herein as a “pLog” value, which represents the negative of the logarithm of the reported P-value. Accordingly, the greater the pLog value, the greater the likelihood that the cDNA sequence and the BLAST “hit” represent homologous proteins.
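By way of illustration only, the conversion between a P-value and a pLog value may be sketched as follows. The text above does not state the base of the logarithm; base 10 is assumed here, and the function names are hypothetical.

```python
import math


def plog(p_value: float) -> float:
    """Negative logarithm of a P-value (base 10 is an assumption)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P-value must lie in the interval (0, 1]")
    return -math.log10(p_value)


def p_from_plog(plog_value: float) -> float:
    """Inverse conversion: recover the P-value from a pLog value."""
    return 10.0 ** (-plog_value)


# Under the base-10 assumption, a pLog of 180.0 corresponds to P = 1e-180.
print(plog(1e-180))
print(p_from_plog(8.22))
```

Under this assumption, each unit increase in pLog corresponds to a tenfold decrease in the probability of a chance match.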

EXAMPLE 3 Characterization of cDNA Clones Encoding Novel Tryptophan Synthase Beta Subunits

[0076] The BLASTX search using the EST sequences from clones listed in Table 3 revealed similarity of the polypeptides encoded by the cDNAs to tryptophan synthase beta subunit from Aquifex aeolicus (NCBI General Identifier No. 2983814) or Archaeoglobus fulgidus (NCBI General Identifier No. 2649345). Shown in Table 3 are the BLAST results for individual ESTs (“EST”), or for the sequences of the entire cDNA inserts comprising the indicated cDNA clones encoding the entire protein (“CGS”):

TABLE 3
BLAST Results for Sequences Encoding Polypeptides Homologous to Tryptophan Synthase Beta Subunit

Clone               Status  Organism                NCBI GI No.  BLAST pLog Score
ccase-b.pk0015.e7   CGS     Aquifex aeolicus        2983814      >254
sfl1.pk131.b23      CGS     Aquifex aeolicus        2983814      180.0
wdk2c.pk005.o10     EST     Archaeoglobus fulgidus  2649345      8.22

[0077] The sequence of the entire cDNA insert in clone wdk2c.pk005.o10 was determined and further sequencing and searching of the Du Pont proprietary EST Database allowed the identification of rice ESTs encoding tryptophan synthase beta subunit homologs. The BLASTX search using the EST sequences from clones listed in Table 4 revealed similarity of the polypeptides encoded by the cDNAs to tryptophan synthase beta subunit from Aquifex aeolicus (NCBI General Identifier No. 7437010) or Archaeoglobus fulgidus (NCBI General Identifier No. 7437017). The amino acid sequence having NCBI General Identifier No. 7437010 is 100% identical to the amino acid sequence having NCBI General Identifier No. 2983814 and the amino acid sequence having NCBI General Identifier No. 7437017 is 100% identical to the amino acid sequence having NCBI General Identifier No. 2649345. Shown in Table 4 are the BLAST results for the sequences of contigs assembled from two or more ESTs (“Contig”) or the sequences of the entire cDNA inserts comprising the indicated cDNA clones encoding an entire protein (“CGS”):

TABLE 4
BLAST Results for Sequences Encoding Polypeptides Homologous to Tryptophan Synthase Beta Subunit

Clone                  Status  Organism                NCBI GI No.  BLAST pLog Score
Contig of:             Contig  Archaeoglobus fulgidus  7437017      28.70
  rdr1f.pk003.m17
  rlr12.pk0007.g7
wdk2c.pk005.o10:fis    CGS     Aquifex aeolicus        7437010      156.00

[0078] FIG. 1 presents an alignment of the amino acid sequences set forth in SEQ ID NOs:2, 4, and 13 and the Aquifex aeolicus sequence (NCBI General Identifier No. 2983814, SEQ ID NO:7). The data in Table 5 present a calculation of the percent identity of the amino acid sequences set forth in SEQ ID NOs:2, 4, 6, 11, and 13 and the Aquifex aeolicus sequence (SEQ ID NO:7).

TABLE 5
Percent Identity of Amino Acid Sequences Deduced From the Nucleotide Sequences of cDNA Clones Encoding Polypeptides Homologous to Tryptophan Synthase Beta Subunit

SEQ ID NO.  Percent Identity to 2983814
2           61.3
4           59.2
6           17.7
11          32.6
13          59.9

[0079] The corn amino acid sequence is 71.8% identical to the soybean sequence and 82.8% identical to the wheat sequence encoding an entire tryptophan synthase beta subunit, while the soybean sequence is 72.1% identical to the wheat sequence.

[0080] There are amino acid sequences for two corn tryptophan synthase beta subunits in the NCBI database. The sequence having General Identifier No. 168572 lacks 54 N-terminal amino acids present in the sequence having General Identifier No. 168574 and there are 9 differences in the rest of the amino acid sequence, making these sequences 97.7% identical to each other. FIG. 2 presents an alignment of the amino acid sequences set forth in SEQ ID NO:2 and the corn tryptophan synthase beta subunits present in the NCBI database (General Identifier No. 168572, SEQ ID NO:8, and General Identifier No. 168574, SEQ ID NO:9). The sequence from clone ccase-b.pk0015.e7 (SEQ ID NO:2) is 22.1% identical to SEQ ID NO:8 and 21.0% identical to SEQ ID NO:9.

[0081] Sequence alignments and percent identity calculations were performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5. Sequence alignments, BLAST scores and probabilities indicate that the nucleic acid fragments comprising the instant cDNA clones encode entire or nearly entire corn, soybean, and wheat novel tryptophan synthase beta subunits and substantial portions of a wheat and a rice novel tryptophan synthase beta subunit.
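By way of illustration only, a percent identity calculation over two already-aligned sequences may be sketched as follows. This is not the Megalign implementation; the convention used here (identical residues divided by the number of aligned columns in which neither sequence has a gap) is an assumption, and other conventions give different values. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned peptides (hypothetical, not from the sequence listing):
# 4 identical residues out of 5 ungapped columns.
print(percent_identity("MKT-LV", "MRTALV"))
```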

EXAMPLE 4 Expression of Chimeric Genes in Monocot Cells

[0082] A chimeric gene comprising a cDNA encoding the instant polypeptides in sense orientation with respect to the maize 27 kD zein promoter that is located 5′ to the cDNA fragment, and the 10 kD zein 3′ end that is located 3′ to the cDNA fragment, can be constructed. The cDNA fragment of this gene may be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites (NcoI or SmaI) can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the digested vector pML103 as described below. Amplification is then performed in a standard PCR. The amplified DNA is then digested with restriction enzymes NcoI and SmaI and fractionated on an agarose gel. The appropriate band can be isolated from the gel and combined with a 4.9 kb NcoI-SmaI fragment of the plasmid pML103. Plasmid pML103 has been deposited under the terms of the Budapest Treaty at ATCC (American Type Culture Collection, 10801 University Blvd., Manassas, Va. 20110-2209), and bears accession number ATCC 97366. The DNA segment from pML103 contains a 1.05 kb SalI-NcoI promoter fragment of the maize 27 kD zein gene and a 0.96 kb SmaI-SalI fragment from the 3′ end of the maize 10 kD zein gene in the vector pGem9Zf(+) (Promega). Vector and insert DNA can be ligated at 15° C. overnight, essentially as described (Maniatis). The ligated DNA may then be used to transform E. coli XL1-Blue (Epicurian Coli XL-1 Blue™; Stratagene). Bacterial transformants can be screened by restriction enzyme digestion of plasmid DNA and limited nucleotide sequence analysis using the dideoxy chain termination method (Sequenase™ DNA Sequencing Kit; U.S. Biochemical). The resulting plasmid construct would comprise a chimeric gene encoding, in the 5′ to 3′ direction, the maize 27 kD zein promoter, a cDNA fragment encoding the instant polypeptides, and the 10 kD zein 3′ region.
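By way of illustration only, an amplified fragment may be checked for NcoI and SmaI recognition sites before digestion, since an additional internal site would cut the insert. The sketch below uses the known recognition sequences CCATGG (NcoI) and CCCGGG (SmaI); the function name and the 40-nucleotide toy amplicon are hypothetical and are not derived from the instant sequences. Because both sites are palindromic, searching one strand is sufficient.

```python
# Recognition sequences of the two cloning enzymes named in the text.
SITES = {"NcoI": "CCATGG", "SmaI": "CCCGGG"}


def find_sites(seq: str, sites: dict[str, str] = SITES) -> dict[str, list[int]]:
    """Return 0-based start positions of each recognition site in `seq`."""
    seq = seq.upper()
    hits: dict[str, list[int]] = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)  # allow overlapping matches
        hits[enzyme] = positions
    return hits


# Hypothetical amplicon: NcoI site near the 5' end, SmaI site near the 3' end.
toy_amplicon = "GC" + "CCATGG" + "CTTCAGGTACCGATTAGCAAGCTT" + "CCCGGG" + "AT"
print(find_sites(toy_amplicon))
```

A fragment suitable for directional cloning as described would be expected to show exactly one position per enzyme, each within a primer-derived end.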

[0083] The chimeric gene described above can then be introduced into corn cells by the following procedure. Immature corn embryos can be dissected from developing caryopses derived from crosses of the inbred corn lines H99 and LH132. The embryos are isolated 10 to 11 days after pollination when they are 1.0 to 1.5 mm long. The embryos are then placed with the axis-side facing down and in contact with agarose-solidified N6 medium (Chu et al. (1975) Sci. Sin. Peking 18:659-668). The embryos are kept in the dark at 27° C. Friable embryogenic callus consisting of undifferentiated masses of cells with somatic proembryoids and embryoids borne on suspensor structures proliferates from the scutellum of these immature embryos. The embryogenic callus isolated from the primary explant can be cultured on N6 medium and sub-cultured on this medium every 2 to 3 weeks.

[0084] The plasmid p35S/Ac (obtained from Dr. Peter Eckes, Hoechst AG, Frankfurt, Germany) may be used in transformation experiments in order to provide for a selectable marker. This plasmid contains the pat gene (see European Patent Publication 0 242 236) which encodes phosphinothricin acetyl transferase (PAT). The enzyme PAT confers resistance to herbicidal glutamine synthetase inhibitors such as phosphinothricin. The pat gene in p35S/Ac is under the control of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313:810-812) and the 3′ region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens.

[0085] The particle bombardment method (Klein et al. (1987) Nature 327:70-73) may be used to transfer genes to the callus culture cells. According to this method, gold particles (1 μm in diameter) are coated with DNA using the following technique. Ten μg of plasmid DNAs are added to 50 μL of a suspension of gold particles (60 mg per mL). Calcium chloride (50 μL of a 2.5 M solution) and spermidine free base (20 μL of a 1.0 M solution) are added to the particles. The suspension is vortexed during the addition of these solutions. After 10 minutes, the tubes are briefly centrifuged (5 sec at 15,000 rpm) and the supernatant removed. The particles are resuspended in 200 μL of absolute ethanol, centrifuged again and the supernatant removed. The ethanol rinse is performed again and the particles resuspended in a final volume of 30 μL of ethanol. An aliquot (5 μL) of the DNA-coated gold particles can be placed in the center of a Kapton™ flying disc (Bio-Rad Labs). The particles are then accelerated into the corn tissue with a Biolistic™ PDS-1000/He (Bio-Rad Instruments, Hercules, Calif.), using a helium pressure of 1000 psi, a gap distance of 0.5 cm and a flying distance of 1.0 cm.
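The quantities in the coating procedure above imply a nominal load per bombardment, which can be worked out as in the following sketch. It assumes, purely for illustration, that all of the DNA binds the gold, that nothing is lost in the ethanol rinses, and that the final suspension is uniform; the resulting DNA figure is therefore an upper bound. The function name is hypothetical.

```python
def per_shot_load(dna_ug: float, gold_ul: float, gold_mg_per_ml: float,
                  final_ul: float, aliquot_ul: float) -> tuple[float, float]:
    """Return (mg gold, ug DNA) delivered per aliquot.

    Assumptions: complete DNA binding, no losses during rinses,
    and a uniformly resuspended final preparation.
    """
    gold_mg = gold_ul / 1000.0 * gold_mg_per_ml  # total gold in the tube
    fraction = aliquot_ul / final_ul             # share of the prep per shot
    return gold_mg * fraction, dna_ug * fraction


# Values from the corn protocol: 10 ug DNA, 50 uL of 60 mg/mL gold,
# 30 uL final ethanol volume, 5 uL loaded per flying disc.
gold_mg, dna_ug = per_shot_load(10.0, 50.0, 60.0, 30.0, 5.0)
print(f"about {gold_mg:.2f} mg gold and at most {dna_ug:.2f} ug DNA per shot")
```

Under these assumptions each 5 μL aliquot carries one sixth of the preparation, so six discs can nominally be loaded from one tube.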

[0086] For bombardment, the embryogenic tissue is placed on filter paper over agarose-solidified N6 medium. The tissue is arranged as a thin lawn and covers a circular area of about 5 cm in diameter. The petri dish containing the tissue can be placed in the chamber of the PDS-1000/He approximately 8 cm from the stopping screen. The air in the chamber is then evacuated to a vacuum of 28 inches of mercury (Hg). The macrocarrier is accelerated with a helium shock wave using a rupture membrane that bursts when the He pressure in the shock tube reaches 1000 psi.

[0087] Seven days after bombardment the tissue can be transferred to N6 medium that contains glufosinate (2 mg per liter) and lacks casein or proline. The tissue continues to grow slowly on this medium. After an additional 2 weeks the tissue can be transferred to fresh N6 medium containing glufosinate. After 6 weeks, areas of about 1 cm in diameter of actively growing callus can be identified on some of the plates containing the glufosinate-supplemented medium. These calli may continue to grow when sub-cultured on the selective medium.

[0088] Plants can be regenerated from the transgenic callus by first transferring clusters of tissue to N6 medium supplemented with 0.2 mg per liter of 2,4-D. After two weeks the tissue can be transferred to regeneration medium (Fromm et al. (1990) Bio/Technology 8:833-839).

EXAMPLE 5 Expression of Chimeric Genes in Dicot Cells

[0089] A seed-specific construct composed of the promoter and transcription terminator from the gene encoding the β subunit of the seed storage protein phaseolin from the bean Phaseolus vulgaris (Doyle et al. (1986) J. Biol. Chem. 261:9228-9238) can be used for expression of the instant polypeptides in transformed soybean. The phaseolin construct includes about 500 nucleotides upstream (5′) from the translation initiation codon and about 1650 nucleotides downstream (3′) from the translation stop codon of phaseolin. Between the 5′ and 3′ regions are the unique restriction endonuclease sites Nco I (which includes the ATG translation initiation codon), Sma I, Kpn I and Xba I. The entire construct is flanked by Hind III sites.

[0090] The cDNA fragment of this gene may be generated by polymerase chain reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites can be incorporated into the oligonucleotides to provide proper orientation of the DNA fragment when inserted into the expression vector. Amplification is then performed as described above, and the isolated fragment is inserted into a pUC18 vector carrying the seed construct.

[0091] Soybean embryos may then be transformed with the expression vector comprising sequences encoding the instant polypeptides. To induce somatic embryos, cotyledons, 3-5 mm in length, dissected from surface-sterilized, immature seeds of the soybean cultivar A2872, can be cultured in the light or dark at 26° C. on an appropriate agar medium for 6-10 weeks. Somatic embryos which produce secondary embryos are then excised and placed into a suitable liquid medium. After repeated selection for clusters of somatic embryos which multiply as early, globular staged embryos, the suspensions are maintained as described below.

[0092] Soybean embryogenic suspension cultures can be maintained in 35 mL liquid media on a rotary shaker, 150 rpm, at 26° C. with fluorescent lights on a 16:8 hour day/night schedule. Cultures are subcultured every two weeks by inoculating approximately 35 mg of tissue into 35 mL of liquid medium.

[0093] Soybean embryogenic suspension cultures may then be transformed by the method of particle gun bombardment (Klein et al. (1987) Nature (London) 327:70-73, U.S. Pat. No. 4,945,050). A DuPont Biolistic™ PDS1000/HE instrument (helium retrofit) can be used for these transformations.

[0094] A selectable marker gene which can be used to facilitate soybean transformation is a chimeric gene composed of the 35S promoter from Cauliflower Mosaic Virus (Odell et al. (1985) Nature 313:810-812), the hygromycin phosphotransferase gene from plasmid pJR225 (from E. coli; Gritz et al. (1983) Gene 25:179-188) and the 3′ region of the nopaline synthase gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens. The seed construct comprising the phaseolin 5′ region, the fragment encoding the instant polypeptides and the phaseolin 3′ region can be isolated as a restriction fragment. This fragment can then be inserted into a unique restriction site of the vector carrying the marker gene.

[0095] To 50 μL of a 60 mg/mL 1 μm gold particle suspension is added (in order): 5 μL DNA (1 μg/μL), 20 μL spermidine (0.1 M), and 50 μL CaCl₂ (2.5 M). The particle preparation is then agitated for three minutes, spun in a microfuge for 10 seconds and the supernatant removed. The DNA-coated particles are then washed once in 400 μL 70% ethanol and resuspended in 40 μL of anhydrous ethanol. The DNA/particle suspension can be sonicated three times for one second each. Five μL of the DNA-coated gold particles are then loaded on each macro carrier disk.

[0096] Approximately 300-400 mg of a two-week-old suspension culture is placed in an empty 60×15 mm petri dish and the residual liquid removed from the tissue with a pipette. For each transformation experiment, approximately 5-10 plates of tissue are normally bombarded. Membrane rupture pressure is set at 1100 psi and the chamber is evacuated to a vacuum of 28 inches of mercury (Hg). The tissue is placed approximately 3.5 inches away from the retaining screen and bombarded three times. Following bombardment, the tissue can be divided in half and placed back into liquid and cultured as described above.

[0097] Five to seven days post bombardment, the liquid media may be exchanged with fresh media, and eleven to twelve days post bombardment with fresh media containing 50 mg/mL hygromycin. This selective medium can be refreshed weekly. Seven to eight weeks post bombardment, green, transformed tissue may be observed growing from untransformed, necrotic embryogenic clusters. Isolated green tissue is removed and inoculated into individual flasks to generate new, clonally propagated, transformed embryogenic suspension cultures. Each new line may be treated as an independent transformation event. These suspensions can then be subcultured and maintained as clusters of immature embryos or regenerated into whole plants by maturation and germination of individual somatic embryos.

EXAMPLE 6 Expression of Chimeric Genes in Microbial Cells

[0098] The cDNAs encoding the instant polypeptides can be inserted into the T7 E. coli expression vector pBT430. This vector is a derivative of pET-3a (Rosenberg et al. (1987) Gene 56:125-135) which employs the bacteriophage T7 RNA polymerase/T7 promoter system. Plasmid pBT430 was constructed by first destroying the EcoR I and Hind III sites in pET-3a at their original positions. An oligonucleotide adaptor containing EcoR I and Hind III sites was inserted at the BamH I site of pET-3a. This created pET-3aM with additional unique cloning sites for insertion of genes into the expression vector. Then, the Nde I site at the position of translation initiation was converted to an Nco I site using oligonucleotide-directed mutagenesis. The DNA sequence of pET-3aM in this region, 5′-CATATGG, was converted to 5′-CCCATGG in pBT430.
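The effect of the mutagenesis step can be checked in silico with a short sketch. It relies on the standard recognition sequences CATATG (Nde I) and CCATGG (Nco I); the helper names are hypothetical and only the two seven-base strings quoted above are examined, not the full vector sequence.

```python
NDE_I_SITE = "CATATG"  # Nde I recognition sequence
NCO_I_SITE = "CCATGG"  # Nco I recognition sequence


def sites_present(seq: str) -> dict[str, bool]:
    """Report which of the two recognition sites occur in seq."""
    seq = seq.upper()
    return {"NdeI": NDE_I_SITE in seq, "NcoI": NCO_I_SITE in seq}


def substitutions(before: str, after: str) -> int:
    """Count positions that differ between two equal-length sequences."""
    if len(before) != len(after):
        raise ValueError("sequences must have equal length")
    return sum(1 for x, y in zip(before, after) if x != y)


pet3am_region = "CATATGG"  # sequence at the initiation codon in pET-3aM
pbt430_region = "CCCATGG"  # sequence after mutagenesis in pBT430

print(sites_present(pet3am_region))
print(sites_present(pbt430_region))
print(substitutions(pet3am_region, pbt430_region), "base changes")
```

Both strings retain the ATG initiation codon at the same position, so the change swaps the cloning site without moving the start of translation.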

[0099] Plasmid DNA containing a cDNA may be appropriately digested to release a nucleic acid fragment encoding the protein. This fragment may then be purified on a 1% NuSieve GTG™ low melting agarose gel (FMC). Buffer and agarose contain 10 μg/mL ethidium bromide for visualization of the DNA fragment. The fragment can then be purified from the agarose gel by digestion with GELase™ (Epicentre Technologies) according to the manufacturer's instructions, ethanol precipitated, dried and resuspended in 20 μL of water. Appropriate oligonucleotide adapters may be ligated to the fragment using T4 DNA ligase (New England Biolabs, Beverly, Mass.). The fragment containing the ligated adapters can be purified from the excess adapters using low melting agarose as described above. The vector pBT430 is digested, dephosphorylated with alkaline phosphatase (NEB) and deproteinized with phenol/chloroform as described above. The prepared vector pBT430 and fragment can then be ligated at 16° C. for 15 hours followed by transformation into DH5 electrocompetent cells (GIBCO BRL). Transformants can be selected on agar plates containing LB media and 100 μg/mL ampicillin. Transformants containing the gene encoding the instant polypeptides are then screened for the correct orientation with respect to the T7 promoter by restriction enzyme analysis.

[0100] For high level expression, a plasmid clone with the cDNA insert in the correct orientation relative to the T7 promoter can be transformed into E. coli strain BL21(DE3) (Studier et al. (1986) J. Mol. Biol. 189:113-130). Cultures are grown in LB medium containing ampicillin (100 mg/L) at 25° C. At an optical density at 600 nm of approximately 1, IPTG (isopropylthio-β-galactoside, the inducer) can be added to a final concentration of 0.4 mM and incubation can be continued for 3 h at 25° C. Cells are then harvested by centrifugation and re-suspended in 50 μL of 50 mM Tris-HCl at pH 8.0 containing 0.1 mM DTT and 0.2 mM phenylmethylsulfonyl fluoride. A small amount of 1 mm glass beads can be added and the mixture sonicated 3 times for about 5 seconds each time with a microprobe sonicator. The mixture is centrifuged and the protein concentration of the supernatant determined. One μg of protein from the soluble fraction of the culture can be separated by SDS-polyacrylamide gel electrophoresis. Gels can be observed for protein bands migrating at the expected molecular weight.
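As a rough guide to where an expressed band might be expected, the mass can be approximated from the polypeptide length. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb and not a value taken from this disclosure, and uses the 497-residue corn polypeptide of SEQ ID NO:2 as input. The function name is hypothetical, and any fusion partner, tag, or removed transit peptide would shift the estimate.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average residue mass


def approx_mass_kda(n_residues: int, extra_da: float = 0.0) -> float:
    """Approximate polypeptide mass in kDa from its length.

    extra_da can hold the mass of a tag or fusion partner, if any.
    """
    if n_residues < 0:
        raise ValueError("residue count must be non-negative")
    return (n_residues * AVERAGE_RESIDUE_MASS_DA + extra_da) / 1000.0


# SEQ ID NO:2 is listed as a 497-residue polypeptide.
print(f"approximately {approx_mass_kda(497):.1f} kDa")
```

A mass computed from the actual amino acid composition would be more accurate, and apparent migration on SDS-polyacrylamide gels can differ from the calculated value.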

EXAMPLE 7 Evaluating Compounds for Their Ability to Inhibit the Activity of Novel Plant Tryptophan Synthase Beta Subunits

[0101] The polypeptides described herein may be produced using any number of methods known to those skilled in the art. Such methods include, but are not limited to, expression in bacteria as described in Example 6, or expression in eukaryotic cell culture, in planta, and using viral expression systems in suitably infected organisms or cell lines. The instant polypeptides may be expressed either as mature forms of the proteins as observed in vivo or as fusion proteins by covalent attachment to a variety of enzymes, proteins or affinity tags. Common fusion protein partners include glutathione S-transferase (“GST”), thioredoxin (“Trx”), maltose binding protein, and C- and/or N-terminal hexahistidine polypeptide (“(His)₆”). The fusion proteins may be engineered with a protease recognition site at the fusion point so that fusion partners can be separated by protease digestion to yield intact mature enzyme. Examples of such proteases include thrombin, enterokinase and factor Xa. However, any protease can be used which specifically cleaves the peptide connecting the fusion protein and the enzyme.

[0102] Purification of the instant polypeptides, if desired, may utilize any number of separation technologies familiar to those skilled in the art of protein purification. Examples of such methods include, but are not limited to, homogenization, filtration, centrifugation, heat denaturation, ammonium sulfate precipitation, desalting, pH precipitation, ion exchange chromatography, hydrophobic interaction chromatography and affinity chromatography, wherein the affinity ligand represents a substrate, substrate analog or inhibitor. When the instant polypeptides are expressed as fusion proteins, the purification protocol may include the use of an affinity resin which is specific for the fusion protein tag attached to the expressed enzyme or an affinity resin containing ligands which are specific for the enzyme. For example, the instant polypeptides may be expressed as a fusion protein coupled to the C-terminus of thioredoxin. In addition, a (His)₆ peptide may be engineered into the N-terminus of the fused thioredoxin moiety to afford additional opportunities for affinity purification. Other suitable affinity resins could be synthesized by linking the appropriate ligands to any suitable resin such as Sepharose-4B. In an alternate embodiment, a thioredoxin fusion protein may be eluted using dithiothreitol; however, elution may be accomplished using other reagents which interact to displace the thioredoxin from the resin. These reagents include β-mercaptoethanol or other reduced thiol. The eluted fusion protein may be subjected to further purification by traditional means as stated above, if desired. Proteolytic cleavage of the thioredoxin fusion protein and the enzyme may be accomplished after the fusion protein is purified or while the protein is still bound to the ThioBond™ affinity resin or other resin.

[0103] Crude, partially purified or purified enzyme, either alone or as a fusion protein, may be utilized in assays for the evaluation of compounds for their ability to inhibit enzymatic activation of the instant polypeptides disclosed herein. Assays may be conducted under well known experimental conditions which permit optimal enzymatic activity. Assays for tryptophan synthase beta subunit are presented by Palombella and Dutcher (1998) Plant Physiol. 117:455-464.

1 13 1 1739 DNA Zea mays 1 gcacgaggga aggcagtcac ttctccagga gcccggagtcggcagcgaga gagaatggcc 60 gccgccgccg ccgccaccac ccttcgtact gctctctcccactcccaagc aacagggcaa 120 gagcagagag cttcactgct ttgcacaccg gagcaccgagttgctgccag caggagaagc 180 ttgagattca ccactagggc cagctcgaat gcgggcgccagtgtgagcat cccgaagcaa 240 tggtacaacc tcatcgccga cctgccggtg aagccaccgccaccgctgca cccgcagaca 300 caccagcctc tgaatcccag cgacctctcc cctctgttccccgacgagct gatcaggcag 360 gaggtcaccg acgagcggtt cgtcgacata cccgaggagatcatcgacgt gtacaagctc 420 tggcgcccga cgcccctgat cagggccagg aggctggagaagctgctcgg cacgccggcc 480 aagatctact acaagtacga ggggaccagc ccggcggggtcgcacaagcc caacaccgcc 540 gtgccgcagg cgtggtacaa cgccgcggcg ggggtgaagaacgtggtcac cgagaccggc 600 gccggccagt ggggcagcgc gctgtccttc gccagcagcctcttcggcct taactgcgag 660 gtatggcagg tgcgcgcgtc gttcgaccag aagccgtaccggaggctgat gatggagacg 720 tggggcgcca aggtgcaccc gtcgccgtcg acggcgacggaggccggcaa gaggatcctg 780 gaggcggacc cgtccagccc gggcagcctc gggatcgccatctccgaggc ggtggaggtg 840 gcggccacca gcgccgacac caagtactgc ctgggcagcgtgctcaacca cgtcctgctc 900 caccagaccg tcatcgggga ggagtgcctg gagcagctagcggcgctcgg cgagacgccc 960 gacgtcgtca tcggctgcac cggcggcggg tccaacttcggcgggctcgc gttcccgttc 1020 ttgcgcgaga agctgcgcgg caggatgagc cccgcgttcagggccgtgga gcccgccgcg 1080 tgccccacgc tcaccaaggg cgtctacgcg tacgacttcggcgacacggc cgggctcacg 1140 ccgctgatga agatgcacac cctcggccac ggcttcgtccccgacccgat ccatgcaggt 1200 gggcttcggt accatggaat ggcacctctg atctcgcacgtgtacgagct gggcttcatg 1260 gatgccgttg ctatacagca gactgaatgc ttccaagctgccttgcaatt cgccaggacg 1320 gagggcatca tcccggcgcc ggagccgacg cacgcaatcgccgcggcgat cagggaggcg 1380 ctggagtgca agcggaccgg ggaggagaag gtcatcctgatggccatgtg cgggcacgga 1440 catttcgacc tcgccgcgta cgagaagtac ctgaggggagacatggtcga tctctcgcac 1500 ccggcggaga agctggaggc ctccctcgct gccgtgcccaaagtctgacg gcgttggagc 1560 caactgcaca tgcgactgga atgggacgaa taatccattgatatcaggtt cttgaatctt 1620 gtggtgatcc atcgcccatc ggcagtggga tacttgtgttccttatgaaa tgaatgaata 1680 aaatttcaat aaaagcattt 
attttatcaa aaaaaaaaaaaaaaaaaaaa aaaaaaaaa 1739 2 497 PRT Zea mays 2 Met Ala Ala Ala Ala AlaAla Thr Thr Leu Arg Thr Ala Leu Ser His 1 5 10 15 Ser Gln Ala Thr GlyGln Glu Gln Arg Ala Ser Leu Leu Cys Thr Pro 20 25 30 Glu His Arg Val AlaAla Ser Arg Arg Ser Leu Arg Phe Thr Thr Arg 35 40 45 Ala Ser Ser Asn AlaGly Ala Ser Val Ser Ile Pro Lys Gln Trp Tyr 50 55 60 Asn Leu Ile Ala AspLeu Pro Val Lys Pro Pro Pro Pro Leu His Pro 65 70 75 80 Gln Thr His GlnPro Leu Asn Pro Ser Asp Leu Ser Pro Leu Phe Pro 85 90 95 Asp Glu Leu IleArg Gln Glu Val Thr Asp Glu Arg Phe Val Asp Ile 100 105 110 Pro Glu GluIle Ile Asp Val Tyr Lys Leu Trp Arg Pro Thr Pro Leu 115 120 125 Ile ArgAla Arg Arg Leu Glu Lys Leu Leu Gly Thr Pro Ala Lys Ile 130 135 140 TyrTyr Lys Tyr Glu Gly Thr Ser Pro Ala Gly Ser His Lys Pro Asn 145 150 155160 Thr Ala Val Pro Gln Ala Trp Tyr Asn Ala Ala Ala Gly Val Lys Asn 165170 175 Val Val Thr Glu Thr Gly Ala Gly Gln Trp Gly Ser Ala Leu Ser Phe180 185 190 Ala Ser Ser Leu Phe Gly Leu Asn Cys Glu Val Trp Gln Val ArgAla 195 200 205 Ser Phe Asp Gln Lys Pro Tyr Arg Arg Leu Met Met Glu ThrTrp Gly 210 215 220 Ala Lys Val His Pro Ser Pro Ser Thr Ala Thr Glu AlaGly Lys Arg 225 230 235 240 Ile Leu Glu Ala Asp Pro Ser Ser Pro Gly SerLeu Gly Ile Ala Ile 245 250 255 Ser Glu Ala Val Glu Val Ala Ala Thr SerAla Asp Thr Lys Tyr Cys 260 265 270 Leu Gly Ser Val Leu Asn His Val LeuLeu His Gln Thr Val Ile Gly 275 280 285 Glu Glu Cys Leu Glu Gln Leu AlaAla Leu Gly Glu Thr Pro Asp Val 290 295 300 Val Ile Gly Cys Thr Gly GlyGly Ser Asn Phe Gly Gly Leu Ala Phe 305 310 315 320 Pro Phe Leu Arg GluLys Leu Arg Gly Arg Met Ser Pro Ala Phe Arg 325 330 335 Ala Val Glu ProAla Ala Cys Pro Thr Leu Thr Lys Gly Val Tyr Ala 340 345 350 Tyr Asp PheGly Asp Thr Ala Gly Leu Thr Pro Leu Met Lys Met His 355 360 365 Thr LeuGly His Gly Phe Val Pro Asp Pro Ile His Ala Gly Gly Leu 370 375 380 ArgTyr His Gly Met Ala Pro Leu Ile Ser His Val Tyr Glu Leu Gly 385 390 395400 Phe Met Asp Ala Val Ala Ile Gln Gln Thr Glu Cys Phe Gln 
Ala Ala 405410 415 Leu Gln Phe Ala Arg Thr Glu Gly Ile Ile Pro Ala Pro Glu Pro Thr420 425 430 His Ala Ile Ala Ala Ala Ile Arg Glu Ala Leu Glu Cys Lys ArgThr 435 440 445 Gly Glu Glu Lys Val Ile Leu Met Ala Met Cys Gly His GlyHis Phe 450 455 460 Asp Leu Ala Ala Tyr Glu Lys Tyr Leu Arg Gly Asp MetVal Asp Leu 465 470 475 480 Ser His Pro Ala Glu Lys Leu Glu Ala Ser LeuAla Ala Val Pro Lys 485 490 495 Val 3 1817 DNA Glycine max unsure(1809)..(1810)..(1811)..(1812) unsure (1814) 3 gcacgaggtt ttccctgttaatgcacttcc tactccttca ccatgtttcc actccaaagt 60 tggaaagcaa tggcctcagggatttgcctt gagtgtgagg ccaacgaatc cgaagaggct 120 ttcaagtgcc tgcaaagtgagagcaacttt gggtgcttct gataaatcaa ttggaattcc 180 caaccaatgg tacaatgtaattgcagatct tccagtgaaa ccacctccac cattgcatcc 240 caagacttat gaaccaatcaaaccagatga cttgtcaccc ctttttcctg atgagttaat 300 cagacaagag atcgccagtgacagattcat agacatacca gatgaagttc ttgatgttta 360 caagctttgg cgcccgacccctctcattag agccaagagg ctggaaaagc ttcttgatac 420 gccggctagg atttactacaagtatgaagg tgtaagcccc gctggatcac acaaaccaaa 480 ctctgctgtt ccacaagcctggtataattt acaagcaggt gtcaagaatg ttgtgacaga 540 aactggtgct ggacagtggggaagtgcatt ggcctttgcg tgcagcatat ttggtcttgg 600 ctgtgaggtg tggcaagtacgtgcttctta tgattcaaaa ccatatcgga gattgatgat 660 gcaaacatgg ggtgcaaaggtacacccatc tccatctatg attactgagg caggtcggag 720 aatgcttcaa gaggatccatcaagcccagg gagtttaggc atagccatat cagaagctgt 780 ggaggttgct gctaaaaatgctgataccaa gtactgcttg gggagtgtac tcaatcatgt 840 tttactccac cagagtgttataggagaaga gtgcatcaaa caaatggaag ctattgggga 900 aaccccagat gtcattataggatgtactgg tggtggctcc aactttgcag gacttagttt 960 cccgttcctt cgagagaagctcaataaaaa aatcaatcct gttataagag cggttgaacc 1020 tgcagcatgt ccttcattaacaaaaggggt atatacttat gattatggtg atacagcagg 1080 gatgactcca ttgatgaaaatgcacacact tggacacgac tttgttccgg atccaattca 1140 tgctgggggt ttgcgttaccacggtatggc accattgatc tcacatgttt ttgacttggg 1200 tttaatggaa gcaattgcaattccacaaac agaatgtttt caaggggcta tacagtttgc 1260 caggtctgaa gggttgataccagctcctga accaactcat gccatagctg caaccattag 
1320 ggaagctatt cgttgtagagaggctggaga ggccaaggtt attctgacag caatgtgtgg 1380 acatggccat tttgatctgccagcttatga aaagtacttg caaggtaaca tggttgacct 1440 ctcattctca gaagacaagatgaaagcttc actggccaat attcctcaag tgattacctg 1500 agttgaggct cattctattgtagtacagtg aggaacaagg aagacataat agtactttac 1560 ttgggaccaa aatgtatggttctctgaaca catatatgta tctgagtttg ttttaggcaa 1620 catttgatcc atgccaaggaaggtgcaact agtagttttt atgaattttt tttctttcaa 1680 gattgatgtg aaaatatagagtcttgcatt ttcaacgtgt gtccaacaca cttagagcat 1740 gtttgtttcc ttgttctaactgcagtgcac gaattccaat gagtaaaacg aaaaagtaac 1800 caagttaann nnanaaa 18174 499 PRT Glycine max 4 His Glu Val Phe Pro Val Asn Ala Leu Pro Thr ProSer Pro Cys Phe 1 5 10 15 His Ser Lys Val Gly Lys Gln Trp Pro Gln GlyPhe Ala Leu Ser Val 20 25 30 Arg Pro Thr Asn Pro Lys Arg Leu Ser Ser AlaCys Lys Val Arg Ala 35 40 45 Thr Leu Gly Ala Ser Asp Lys Ser Ile Gly IlePro Asn Gln Trp Tyr 50 55 60 Asn Val Ile Ala Asp Leu Pro Val Lys Pro ProPro Pro Leu His Pro 65 70 75 80 Lys Thr Tyr Glu Pro Ile Lys Pro Asp AspLeu Ser Pro Leu Phe Pro 85 90 95 Asp Glu Leu Ile Arg Gln Glu Ile Ala SerAsp Arg Phe Ile Asp Ile 100 105 110 Pro Asp Glu Val Leu Asp Val Tyr LysLeu Trp Arg Pro Thr Pro Leu 115 120 125 Ile Arg Ala Lys Arg Leu Glu LysLeu Leu Asp Thr Pro Ala Arg Ile 130 135 140 Tyr Tyr Lys Tyr Glu Gly ValSer Pro Ala Gly Ser His Lys Pro Asn 145 150 155 160 Ser Ala Val Pro GlnAla Trp Tyr Asn Leu Gln Ala Gly Val Lys Asn 165 170 175 Val Val Thr GluThr Gly Ala Gly Gln Trp Gly Ser Ala Leu Ala Phe 180 185 190 Ala Cys SerIle Phe Gly Leu Gly Cys Glu Val Trp Gln Val Arg Ala 195 200 205 Ser TyrAsp Ser Lys Pro Tyr Arg Arg Leu Met Met Gln Thr Trp Gly 210 215 220 AlaLys Val His Pro Ser Pro Ser Met Ile Thr Glu Ala Gly Arg Arg 225 230 235240 Met Leu Gln Glu Asp Pro Ser Ser Pro Gly Ser Leu Gly Ile Ala Ile 245250 255 Ser Glu Ala Val Glu Val Ala Ala Lys Asn Ala Asp Thr Lys Tyr Cys260 265 270 Leu Gly Ser Val Leu Asn His Val Leu Leu His Gln Ser Val IleGly 275 280 285 Glu Glu Cys Ile Lys Gln Met Glu Ala Ile Gly 
Glu Thr ProAsp Val 290 295 300 Ile Ile Gly Cys Thr Gly Gly Gly Ser Asn Phe Ala GlyLeu Ser Phe 305 310 315 320 Pro Phe Leu Arg Glu Lys Leu Asn Lys Lys IleAsn Pro Val Ile Arg 325 330 335 Ala Val Glu Pro Ala Ala Cys Pro Ser LeuThr Lys Gly Val Tyr Thr 340 345 350 Tyr Asp Tyr Gly Asp Thr Ala Gly MetThr Pro Leu Met Lys Met His 355 360 365 Thr Leu Gly His Asp Phe Val ProAsp Pro Ile His Ala Gly Gly Leu 370 375 380 Arg Tyr His Gly Met Ala ProLeu Ile Ser His Val Phe Asp Leu Gly 385 390 395 400 Leu Met Glu Ala IleAla Ile Pro Gln Thr Glu Cys Phe Gln Gly Ala 405 410 415 Ile Gln Phe AlaArg Ser Glu Gly Leu Ile Pro Ala Pro Glu Pro Thr 420 425 430 His Ala IleAla Ala Thr Ile Arg Glu Ala Ile Arg Cys Arg Glu Ala 435 440 445 Gly GluAla Lys Val Ile Leu Thr Ala Met Cys Gly His Gly His Phe 450 455 460 AspLeu Pro Ala Tyr Glu Lys Tyr Leu Gln Gly Asn Met Val Asp Leu 465 470 475480 Ser Phe Ser Glu Asp Lys Met Lys Ala Ser Leu Ala Asn Ile Pro Gln 485490 495 Val Ile Thr 5 543 DNA Triticum aestivum unsure (377) unsure(480) unsure (482) unsure (534) 5 ggttgactac ccactgagca gagcagctttcggggcgcga tccaggaagc caggaacggc 60 aaagcaggcg gcgcggcggc ggctccatcgtgacgacaag aaatggccac cgccctccgc 120 cctccccggc tcccagcagt tccagagcaagcctcttcac ttcatcgcct accaaagtac 180 agagttgccg tcactgggcg taggagcttcgccgccaggg ccggctcgta tccaggcaac 240 gtgggcgtcc cgaagcaatg gtacaacctcatcgccgacc tgccggtgaa gccgccgccg 300 atgctgcacc cggggaccac cagccgctgaaccccagcga cctggcccct ctcttccccg 360 acgagctcat caagcangac tcacggaggagcgcttcatc gacatacccg acaagtccgg 420 gatgtctaca actctggcgc ccgacccactgatcaagggc aagaggctgg agaactgtcn 480 gnacccggga agtctactac aagtacgagggactaacccg cgggtccaca aggnaacacg 540 cct 543 6 147 PRT Triticum aestivumUNSURE (73) UNSURE (92) UNSURE (95) UNSURE (145) 6 Met Ala Thr Ala LeuArg Pro Pro Arg Leu Pro Ala Val Pro Glu Gln 1 5 10 15 Ala Ser Ser LeuHis Arg Leu Pro Lys Tyr Arg Val Ala Val Thr Gly 20 25 30 Arg Arg Ser PheAla Ala Arg Ala Gly Ser Tyr Pro Gly Asn Val Gly 35 40 45 Val Pro Lys GlnTrp Tyr Asn Leu Ile Ala 
Asp Leu Pro Val Lys Pro 50 55 60 Pro Pro Met LeuHis Pro Gly Thr Xaa Gln Pro Leu Asn Pro Ser Asp 65 70 75 80 Leu Ala ProLeu Phe Pro Asp Glu Leu Ile Lys Xaa Asp Ser Xaa Glu 85 90 95 Glu Arg PheIle Asp Ile Pro Asp Lys Ser Gly Met Ser Thr Thr Leu 100 105 110 Ala ProAsp Pro Leu Ile Lys Gly Lys Arg Leu Glu Asn Cys Arg Thr 115 120 125 ArgGlu Val Tyr Tyr Lys Tyr Glu Gly Leu Thr Arg Gly Ser Thr Arg 130 135 140Xaa His Ala 145 7 434 PRT Aquifex aeolicus 7 Met Arg Lys Phe Leu Leu SerGlu Gly Glu Ile Pro Lys Lys Trp Leu 1 5 10 15 Asn Ile Leu Pro Leu LeuPro Glu Pro Leu Glu Pro Pro Leu Asp Pro 20 25 30 Glu Thr Met Glu Pro ValLys Pro Glu Lys Leu Leu Ala Ile Phe Pro 35 40 45 Glu Pro Leu Val Glu GlnGlu Val Ser Asp Lys Glu Trp Ile Asp Ile 50 55 60 Pro Glu Glu Val Leu AspIle Tyr Ser Leu Trp Arg Pro Thr Pro Leu 65 70 75 80 His Arg Ala Lys AsnLeu Glu Glu Phe Leu Gly Thr Pro Ala Lys Ile 85 90 95 Phe Tyr Lys Asn GluSer Val Ser Pro Pro Gly Ser His Lys Pro Asn 100 105 110 Thr Ala Val AlaGln Ala Tyr Tyr Asn Lys Ile Ser Gly Val Lys Arg 115 120 125 Leu Thr ThrGlu Thr Gly Ala Gly Gln Trp Gly Ser Ala Leu Ser Phe 130 135 140 Ala ThrGln Phe Phe Asp Leu Gln Cys Arg Val Tyr Met Val Arg Val 145 150 155 160Ser Tyr Asn Gln Lys Pro Tyr Arg Arg Ile Leu Met Glu Thr Trp Lys 165 170175 Gly Glu Val Ile Pro Ser Pro Ser Pro Tyr Thr Asn Ala Gly Arg Lys 180185 190 Tyr Tyr Glu Glu Asn Pro Glu His Pro Gly Ser Leu Gly Ile Ala Ile195 200 205 Ser Glu Ala Ile Glu Glu Ala Ala Ser Arg Glu Asp Thr Lys TyrSer 210 215 220 Leu Gly Ser Val Leu Asn His Val Leu Leu His Gln Thr ValIle Gly 225 230 235 240 Leu Glu Ala Lys Lys Gln Met Glu Glu Ala Gly TyrTyr Pro Asp Val 245 250 255 Ile Ile Gly Ala Val Gly Gly Gly Ser Asn PheAla Gly Leu Ser Phe 260 265 270 Pro Phe Leu Ala Asp Val Leu Arg Gly AspLys Arg Lys Glu Asp Leu 275 280 285 Lys Val Leu Ala Val Glu Pro Glu AlaCys Pro Thr Leu Thr Lys Gly 290 295 300 Glu Tyr Lys Tyr Asp Phe Gly AspSer Val Gly Leu Thr Pro Leu Ile 305 310 315 320 Lys Met Tyr Thr Leu GlyHis Asp Phe Val Pro Ser Pro 
Ile His Ala 325 330 335 Gly Gly Leu Arg TyrHis Gly Asp Ala Pro Leu Val Cys Lys Leu Tyr 340 345 350 Asn Leu Gly TyrIle Asp Ala Val Ala Tyr Lys Gln Thr Glu Val Phe 355 360 365 Glu Ala AlaVal Thr Phe Ala Arg Thr Glu Gly Ile Val Pro Ala Pro 370 375 380 Glu SerAla His Ala Ile Lys Ala Ala Ile Asp Glu Ala Leu Lys Cys 385 390 395 400Lys Glu Thr Gly Glu Glu Lys Val Ile Leu Phe Asn Leu Ser Gly His 405 410415 Gly Tyr Phe Asp Leu Ser Ala Tyr Asp Lys Tyr Leu His Gly Glu Leu 420425 430 Thr Asp 8 389 PRT Zea mays 8 Gly Arg Phe Gly Gly Lys Tyr Val ProGlu Thr Leu Met His Ala Leu 1 5 10 15 Thr Glu Leu Glu Asn Ala Phe HisAla Leu Ala Thr Asp Asp Glu Phe 20 25 30 Gln Lys Glu Leu Asp Gly Ile LeuLys Asp Tyr Val Gly Arg Glu Ser 35 40 45 Pro Leu Tyr Phe Ala Glu Arg LeuThr Glu His Tyr Lys Arg Ala Asp 50 55 60 Gly Thr Gly Pro Leu Ile Tyr LeuLys Arg Glu Asp Leu Asn His Arg 65 70 75 80 Gly Ala His Lys Ile Asn AsnAla Val Ala Gln Ala Leu Leu Ala Lys 85 90 95 Arg Leu Gly Lys Gln Arg IleIle Ala Glu Thr Gly Ala Gly Gln His 100 105 110 Gly Val Ala Thr Ala ThrVal Cys Ala Arg Phe Gly Leu Gln Cys Ile 115 120 125 Ile Tyr Met Gly AlaGln Asp Met Glu Arg Gln Ala Leu Asn Val Phe 130 135 140 Arg Met Lys LeuLeu Gly Ala Glu Val Arg Ala Val His Ser Gly Thr 145 150 155 160 Ala ThrLeu Lys Asp Ala Thr Ser Glu Ala Ile Arg Asp Trp Val Thr 165 170 175 AsnVal Glu Thr Thr His Tyr Ile Leu Gly Ser Val Ala Gly Pro His 180 185 190Pro Tyr Pro Met Met Val Arg Glu Phe His Lys Val Ile Gly Lys Glu 195 200205 Thr Arg Arg Gln Ala Met His Lys Trp Gly Gly Lys Pro Asp Val Leu 210215 220 Val Ala Cys Val Gly Gly Gly Ser Asn Ala Met Gly Leu Phe His Glu225 230 235 240 Phe Val Glu Asp Gln Asp Val Arg Leu Ile Gly Val Glu AlaAla Gly 245 250 255 His Gly Val Asp Thr Asp Lys His Ala Ala Thr Leu ThrLys Gly Gln 260 265 270 Val Gly Val Leu His Gly Ser Met Ser Tyr Leu LeuGln Asp Asp Asp 275 280 285 Gly Gln Val Ile Glu Pro His Ser Ile Ser AlaGly Leu Asp Tyr Pro 290 295 300 Gly Val Gly Pro Glu His Ser Phe Leu LysAsp Ile Gly Arg Ala Glu 305 310 315 
320 Tyr Asp Ser Val Thr Asp Gln GluAla Leu Asp Ala Phe Lys Arg Val 325 330 335 Ser Arg Leu Glu Gly Ile IlePro Ala Leu Glu Thr Ser His Ala Leu 340 345 350 Ala Tyr Leu Glu Lys LeuCys Pro Thr Leu Pro Asp Gly Val Arg Val 355 360 365 Val Leu Asn Cys SerGly Arg Gly Asp Lys Asp Val His Thr Ala Ser 370 375 380 Lys Tyr Leu AspVal 385 9 443 PRT Zea mays 9 Pro Gly Pro Pro Pro Pro Ala Pro Glu Gly ArgArg Arg Arg Gly Arg 1 5 10 15 Gly Arg Asn Ala Ala Gly Gln Ala Val AlaAla Glu Ala Ser Pro Ala 20 25 30 Ala Val Glu Met Gly Asn Gly Ala Ala AlaPro Gly Leu Gln Arg Pro 35 40 45 Asp Ala Met Gly Arg Phe Gly Arg Phe GlyGly Lys Tyr Val Pro Glu 50 55 60 Thr Leu Met His Ala Leu Thr Glu Leu GluSer Ala Phe His Ala Leu 65 70 75 80 Ala Thr Asp Asp Glu Phe Gln Lys GluLeu Asp Gly Ile Leu Lys Asp 85 90 95 Tyr Val Gly Arg Glu Ser Pro Leu TyrPhe Ala Glu Arg Leu Thr Glu 100 105 110 His Tyr Lys Arg Ala Asp Gly ThrGly Pro Leu Ile Tyr Leu Lys Arg 115 120 125 Glu Asp Leu Asn His Thr GlyAla His Lys Ile Asn Asn Ala Val Ala 130 135 140 Gln Ala Leu Leu Ala LysArg Leu Gly Lys Gln Arg Ile Ile Ala Glu 145 150 155 160 Thr Gly Ala GlyGln His Gly Val Ala Thr Ala Thr Val Cys Arg Arg 165 170 175 Phe Gly LeuGln Cys Ile Ile Tyr Met Gly Ala Gln Asp Met Glu Arg 180 185 190 Gln AlaLeu Asn Val Phe Arg Met Arg Leu Leu Gly Ala Glu Val Arg 195 200 205 AlaVal His Ser Gly Thr Ala Thr Leu Lys Asp Ala Thr Ser Glu Ala 210 215 220Ile Arg Asp Trp Val Thr Asn Val Glu Thr Thr His Tyr Ile Leu Gly 225 230235 240 Ser Val Ala Gly Pro His Pro Tyr Pro Met Met Val Arg Glu Phe His245 250 255 Lys Val Ile Gly Lys Glu Thr Arg Arg Gln Ala Met Asp Lys TrpGly 260 265 270 Gly Lys Pro Asp Val Leu Val Ala Cys Val Gly Gly Gly SerAsn Ala 275 280 285 Met Gly Leu Phe His Glu Phe Val Glu Asp Gln Asp ValArg Leu Val 290 295 300 Gly Leu Glu Ala Ala Gly His Gly Val Asp Thr AspLys His Ala Ala 305 310 315 320 Thr Leu Thr Lys Gly Gln Val Gly Val LeuHis Gly Ser Met Ser Tyr 325 330 335 Leu Leu Gln Asp Asp Asp Gly Gln ValIle Glu Pro His Ser Ile Ser 340 345 350 Ala 
Gly Leu Asp Tyr Pro Gly ValGly Pro Glu His Ser Phe Leu Lys 355 360 365 Asp Ile Gly Arg Ala Glu TyrAsp Ser Val Thr Asp Gln Glu Ala Leu 370 375 380 Asp Ala Phe Lys Arg ValSer Arg Leu Glu Gly Ile Ile Pro Ala Leu 385 390 395 400 Glu Thr Ser HisAla Leu Ala Tyr Leu Glu Lys Leu Cys Pro Thr Leu 405 410 415 Ala Asp GlyVal Arg Val Val Val Asn Cys Ser Gly Arg Gly Asp Lys 420 425 430 Asp ValHis Thr Ala Ser Lys Tyr Leu Asp Val 435 440 10 666 DNA Oryza sativaunsure (73) unsure (580) unsure (609) unsure (640) unsure (648) unsure(650) unsure (658) unsure (661) 10 gaagagcgca agccgggaga agcaccaccacctagataga acaataaaaa ttgctgcatc 60 cgtcagagct ggncactaca aaaaccagctgcatcacagg agagagcagg cagagaggca 120 gctctccagc tcgcgtttgg ggttgaccagttactccggg cgaaccgacg gcaagccggc 180 gccggcggcg agggtcgtcg ggagcggaagagagggagaa agaatggcca ccaccgcctc 240 cgtccgacct cccctgctcc gacaagcagcaggttcagaa aaagcctcac tcctttgcaa 300 accaaagcag agagcttctg tcagaagaagaagcttcact gccagggcca gctcgaatcc 360 tgtgagcatc ccgaagcaat ggtacaacctcgtcgccgac ctgccggtga agccaccgcc 420 gccgctgcac ccgcagacgc accagccactgaaccctagt gacctgtccc ctctcttccc 480 cgacgagctg atcaggcagg aggtgaccgaggagcggttc atcgacatcc cggaggaggt 540 cgccgaggtt tacaagctct ggcgcccgacgccgctgatn aaggcgaggg aggctggaga 600 agctgctgng cacgccggcg aatatttactacaaagtacn aaggggancn agcccggngg 660 nggtcc 666 11 144 PRT Oryza sativaUNSURE (119) UNSURE (129) UNSURE (139) UNSURE (142) 11 Met Ala Thr ThrAla Ser Val Arg Pro Pro Leu Leu Arg Gln Ala Ala 1 5 10 15 Gly Ser GluLys Ala Ser Leu Leu Cys Lys Pro Lys Gln Arg Ala Ser 20 25 30 Val Arg ArgArg Ser Phe Thr Ala Arg Ala Ser Ser Asn Pro Val Ser 35 40 45 Ile Pro LysGln Trp Tyr Asn Leu Val Ala Asp Leu Pro Val Lys Pro 50 55 60 Pro Pro ProLeu His Pro Gln Thr His Gln Pro Leu Asn Pro Ser Asp 65 70 75 80 Leu SerPro Leu Phe Pro Asp Glu Leu Ile Arg Gln Glu Val Thr Glu 85 90 95 Glu ArgPhe Ile Asp Ile Pro Glu Glu Val Ala Glu Val Tyr Lys Leu 100 105 110 TrpArg Pro Thr Pro Leu Xaa Arg Arg Gly Arg Leu Glu Lys Leu Leu 115 120 125Xaa Thr Pro 
Ala Asn Ile Tyr Tyr Lys Val Xaa Arg Gly Xaa Ser Pro 130 135140 12 1709 DNA Triticum aestivum 12 gcacgagggt tgactaccca ctgagcagagcagctttcgg ggcgcgatcc aggaagccag 60 gaacggcaaa gcaggcggcg cggcggcggctccatcgtga cgacaagaaa tggccaccgc 120 cctccgccct ccccggctcc cagcagttccagagcaagcc tcttcacttc atcgcctacc 180 aaagtacaga gttgccgtca ctgggcgtaggagcttcgcc gccagggccg gctcgtatcc 240 aggcaacgtg ggcgtcccga agcaatggtacaacctcatc gccgacctgc cggtgaagcc 300 gccgccgatg ctgcacccgg ggacgcaccagccgctgaac cccagcgacc tggcccctct 360 cttccccgac gagctcatca ggcaggagctcacggaggag cgcttcatcg acatacccga 420 cgaggtccgg gatgtctacg agctctggcgcccgacgcca ctgatcaggg ccaagaggct 480 ggagaagctg ctcggcacgc cggcgaagatctactacaag tacgagggca ctagcccggc 540 ggggtcgcac aagggcaaca ccgccgtgccgcaggcgtgg tacaacgccg cggcgggggt 600 caagaacgtg gtcaccgaga ccggcgccggccagtggggc agcgcgctct ccttcgccag 660 caccctcttc ggcctcaact gcgaggtgtggcaggtgcgc gcgtcctacg accagaagcc 720 gtaccggagg ctgatgatgg agacgtggggcgccaaggtg cacccgtcgc cgtccgacgt 780 gacggaggcc ggcaggaagc tcctggcggcggacccggcc agcccgggga gcctcgggat 840 ggccatctcc gaggcggtgg aggtcgcggccaccaacgcc gacaccaagt actgcctcgg 900 cagcgtgctc aaccacgtcc tgctgcaccagaccgtcatc ggcgaggagt gcctggagca 960 gctggcggcc atcggcgaca ccccggacgtcgtcatcggc tgcaccggcg gcggctccaa 1020 cttcggcggg ctcgccttcc ccttcatgcgcgagaagctg gccggcagga tgagcccgca 1080 gttcaaggcc gtggagcccg cggcgtgccccacgctcacc aagggcgtct acgcctacga 1140 ctacggcgac acggccgggc tgacgccgctcatgaagatg cacaccctcg gccacgactt 1200 tgtccccgat cccatccatg caggtgggcttcgctaccat ggaatggcgc ctctgatttc 1260 ccatgtgtat gagctcgggt tcatggaggccatgtccata cagcaaactg agtgcttcga 1320 agctgcattg caatttgcac ggacggagggcatcatccca gcgccggagc cgacgcacgc 1380 gatcgcggcg gcgatcaggg aagcgctggagtgcaagagg accggggagg agaaggtcat 1440 cctcatcgcc atgtgcggcc acggccacttcgacctcgcc gcctacgacc ggtacctgag 1500 aggcgacatg attgatctct cgcactcctccgagaagctc aaggagtctc tgggtgccat 1560 tcccaaagtc tgatgctaga gattcagagattgatagaag aacggtttgg gaagtgggaa 1620 tacaataaga tgaacaatgt 
gacgctttcttggtgcatgg cacacataaa tttgatcaat 1680 aaaagatgtt accttttggc taaaaaaaa1709 13 487 PRT Triticum aestivum 13 Met Ala Thr Ala Leu Arg Pro Pro ArgLeu Pro Ala Val Pro Glu Gln 1 5 10 15 Ala Ser Ser Leu His Arg Leu ProLys Tyr Arg Val Ala Val Thr Gly 20 25 30 Arg Arg Ser Phe Ala Ala Arg AlaGly Ser Tyr Pro Gly Asn Val Gly 35 40 45 Val Pro Lys Gln Trp Tyr Asn LeuIle Ala Asp Leu Pro Val Lys Pro 50 55 60 Pro Pro Met Leu His Pro Gly ThrHis Gln Pro Leu Asn Pro Ser Asp 65 70 75 80 Leu Ala Pro Leu Phe Pro AspGlu Leu Ile Arg Gln Glu Leu Thr Glu 85 90 95 Glu Arg Phe Ile Asp Ile ProAsp Glu Val Arg Asp Val Tyr Glu Leu 100 105 110 Trp Arg Pro Thr Pro LeuIle Arg Ala Lys Arg Leu Glu Lys Leu Leu 115 120 125 Gly Thr Pro Ala LysIle Tyr Tyr Lys Tyr Glu Gly Thr Ser Pro Ala 130 135 140 Gly Ser His LysGly Asn Thr Ala Val Pro Gln Ala Trp Tyr Asn Ala 145 150 155 160 Ala AlaGly Val Lys Asn Val Val Thr Glu Thr Gly Ala Gly Gln Trp 165 170 175 GlySer Ala Leu Ser Phe Ala Ser Thr Leu Phe Gly Leu Asn Cys Glu 180 185 190Val Trp Gln Val Arg Ala Ser Tyr Asp Gln Lys Pro Tyr Arg Arg Leu 195 200205 Met Met Glu Thr Trp Gly Ala Lys Val His Pro Ser Pro Ser Asp Val 210215 220 Thr Glu Ala Gly Arg Lys Leu Leu Ala Ala Asp Pro Ala Ser Pro Gly225 230 235 240 Ser Leu Gly Met Ala Ile Ser Glu Ala Val Glu Val Ala AlaThr Asn 245 250 255 Ala Asp Thr Lys Tyr Cys Leu Gly Ser Val Leu Asn HisVal Leu Leu 260 265 270 His Gln Thr Val Ile Gly Glu Glu Cys Leu Glu GlnLeu Ala Ala Ile 275 280 285 Gly Asp Thr Pro Asp Val Val Ile Gly Cys ThrGly Gly Gly Ser Asn 290 295 300 Phe Gly Gly Leu Ala Phe Pro Phe Met ArgGlu Lys Leu Ala Gly Arg 305 310 315 320 Met Ser Pro Gln Phe Lys Ala ValGlu Pro Ala Ala Cys Pro Thr Leu 325 330 335 Thr Lys Gly Val Tyr Ala TyrAsp Tyr Gly Asp Thr Ala Gly Leu Thr 340 345 350 Pro Leu Met Lys Met HisThr Leu Gly His Asp Phe Val Pro Asp Pro 355 360 365 Ile His Ala Gly GlyLeu Arg Tyr His Gly Met Ala Pro Leu Ile Ser 370 375 380 His Val Tyr GluLeu Gly Phe Met Glu Ala Met Ser Ile Gln Gln Thr 385 390 395 400 Glu 
CysPhe Glu Ala Ala Leu Gln Phe Ala Arg Thr Glu Gly Ile Ile 405 410 415 ProAla Pro Glu Pro Thr His Ala Ile Ala Ala Ala Ile Arg Glu Ala 420 425 430Leu Glu Cys Lys Arg Thr Gly Glu Glu Lys Val Ile Leu Ile Ala Met 435 440445 Cys Gly His Gly His Phe Asp Leu Ala Ala Tyr Asp Arg Tyr Leu Arg 450455 460 Gly Asp Met Ile Asp Leu Ser His Ser Ser Glu Lys Leu Lys Glu Ser465 470 475 480 Leu Gly Ala Ile Pro Lys Val 485
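The relationship between the nucleotide and amino acid entries in the listing can be spot-checked by translation. The sketch below is illustrative only and is not part of the original listing. It assumes the standard genetic code and takes the open reading frame to begin at the atg found at nucleotides 110 to 112 of SEQ ID NO:12. The names `CODON_TABLE`, `translate`, and `SEQ12_NT_110_160` are hypothetical and introduced here for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order t, c, a, g.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1 into one-letter amino acid codes.

    Codons containing an ambiguous base (for example 'n') are reported as 'X'.
    Trailing bases that do not fill a codon are ignored.
    """
    dna = dna.lower().replace(" ", "")
    residues = []
    for i in range(0, len(dna) - 2, 3):
        residues.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(residues)


# Nucleotides 110-160 of SEQ ID NO:12 (Triticum aestivum), copied from the listing.
SEQ12_NT_110_160 = (
    "a" "tggccaccgc" "cctccgccct" "ccccggctcc" "cagcagttcc" "agagcaagcc"
)

# Expected to reproduce residues 1-17 of SEQ ID NO:13:
# Met Ala Thr Ala Leu Arg Pro Pro Arg Leu Pro Ala Val Pro Glu Gln Ala
print(translate(SEQ12_NT_110_160))  # MATALRPPRLPAVPEQA
```

The same function could be applied to the other nucleotide entries, with the caveat that positions marked "unsure" in the listing translate to X.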

What is claimed is:
 1. An isolated polynucleotide comprising a nucleotide sequence selected from the group consisting of: (a) a first nucleotide sequence of at least 700 nucleotides from SEQ ID NO:1; (b) a second nucleotide sequence of at least 420 nucleotides from the group consisting of SEQ ID NOs:3, 5, 10, and 12; (c) a third nucleotide sequence encoding a polypeptide of at least 100 amino acids that has at least 80% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:2, 4, 6, 11, and 13; and (d) a fourth nucleotide sequence comprising a complement of (a) or (b).
 2. The isolated polynucleotide of claim 1 wherein the nucleotide sequences are DNA.
 3. The isolated polynucleotide of claim 1 wherein the nucleotide sequences are RNA.
 4. A chimeric gene comprising the isolated polynucleotide of claim 1 operably linked to at least one suitable regulatory sequence.
 5. A host cell comprising the chimeric gene of claim 4.
 6. A host cell comprising the isolated polynucleotide of claim 1.
 7. The host cell of claim 6 wherein the host cell is selected from the group consisting of yeast, bacteria, and plant.
 8. A virus comprising the isolated polynucleotide of claim 1.
 9. A polypeptide of at least 100 amino acids that has at least 80% identity based on the Clustal method of alignment when compared to a polypeptide selected from the group consisting of SEQ ID NOs:2, 4, 6, 11, and 13.
 10. A method of selecting an isolated polynucleotide that affects the level of expression of a plant tryptophan synthase beta subunit polypeptide in a plant cell, the method comprising the steps of: (a) constructing the isolated polynucleotide comprising a nucleotide sequence of at least one of 30 contiguous nucleotides derived from the isolated polynucleotide of claim 1; (b) introducing the isolated polynucleotide into the plant cell; (c) measuring the level of the polypeptide in the plant cell containing the polynucleotide; and (d) comparing the level of the polypeptide in the plant cell containing the isolated polynucleotide with the level of the polypeptide in a plant cell that does not contain the isolated polynucleotide.
 11. The method of claim 10 wherein the isolated polynucleotide comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 10, and 12.
 12. A method of selecting an isolated polynucleotide that affects the level of expression of a plant tryptophan synthase beta subunit polypeptide in a plant cell, the method comprising the steps of: (a) constructing the isolated polynucleotide of claim 1; (b) introducing the isolated polynucleotide into the plant cell; (c) measuring the level of the polypeptide in the plant cell containing the polynucleotide; and (d) comparing the level of the polypeptide in the plant cell containing the isolated polynucleotide with the level of the polypeptide in a plant cell that does not contain the polynucleotide.
 13. A method of obtaining a nucleic acid fragment encoding a plant tryptophan synthase beta subunit polypeptide comprising the steps of: (a) synthesizing an oligonucleotide primer comprising a nucleotide sequence of at least one of 30 contiguous nucleotides derived from a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 10, and 12 and a complement of such nucleotide sequences; and (b) amplifying a nucleic acid sequence using the oligonucleotide primer.
 14. A method of obtaining a nucleic acid fragment encoding a plant tryptophan synthase beta subunit polypeptide comprising the steps of: (a) probing a cDNA or genomic library with an isolated polynucleotide comprising at least one of 30 contiguous nucleotides derived from a nucleotide sequence selected from the group consisting of SEQ ID NOs:1, 3, 5, 10, and 12 and a complement of such nucleotide sequences; (b) identifying a DNA clone that hybridizes with the isolated polynucleotide; (c) isolating the identified DNA clone; and (d) sequencing a cDNA or genomic fragment that comprises the isolated DNA clone.
 15. A composition comprising the isolated polynucleotide of claim 1.
 16. A composition comprising the isolated polypeptide of claim 9.
 17. The isolated polynucleotide of claim 1 comprising a nucleotide sequence having at least one of 30 contiguous nucleotides.
 18. A method for positive selection of a transformed cell comprising: (a) transforming a host cell with the chimeric gene of claim 4; and (b) growing the transformed host cell under conditions which allow expression of a polynucleotide in an amount sufficient to complement a null mutant to provide a positive selection means.
 19. The method of claim 18 wherein the host cell is a plant.
 20. The method of claim 19 wherein the plant cell is a monocot.
 21. The method of claim 19 wherein the plant cell is a dicot.
 22. A method of altering the level of expression of a plant tryptophan synthase beta subunit in a host cell comprising: (a) transforming a host cell with the chimeric gene of claim 4; and (b) growing the transformed host cell produced in step (a) under conditions that are suitable for expression of the chimeric gene wherein expression of the chimeric gene results in production of altered levels of the plant tryptophan synthase beta subunit in the transformed host cell.
 23. A method for evaluating at least one compound for its ability to inhibit the activity of a plant tryptophan synthase beta subunit, the method comprising the steps of: (a) transforming a host cell with a chimeric gene comprising a nucleic acid fragment encoding a plant tryptophan synthase beta subunit polypeptide, operably linked to at least one suitable regulatory sequence; (b) growing the transformed host cell under conditions that are suitable for expression of the chimeric gene wherein expression of the chimeric gene results in production of the plant tryptophan synthase beta subunit polypeptide encoded by the operably linked nucleic acid fragment in the transformed host cell; (c) optionally purifying the plant tryptophan synthase beta subunit polypeptide expressed by the transformed host cell; (d) treating the plant tryptophan synthase beta subunit polypeptide with a compound to be tested; and (e) comparing the activity of the plant tryptophan synthase beta subunit polypeptide that has been treated with the test compound to the activity of an untreated plant tryptophan synthase beta subunit polypeptide, thereby selecting compounds with potential for inhibitory activity.
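
Claims 1(c) and 9 recite a percent identity threshold determined by the Clustal method of alignment. The sketch below is illustrative only and does not implement Clustal: it assumes the two sequences have already been aligned (gaps written as "-") and counts identical residues over the columns where neither sequence has a gap. That denominator is an assumption, and alignment programs may report identity differently. The function name `percent_identity` and the fragment constants are hypothetical. The fragments are residues 49 to 64 of SEQ ID NO:11 (rice) and SEQ ID NO:13 (wheat), which are far shorter than the 100 amino acids the claims require and serve only to show the arithmetic.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Both inputs must be rows of the same alignment (equal length, '-' for gaps).
    Columns where either row has a gap are skipped (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Residues 49-64 of SEQ ID NO:11 and SEQ ID NO:13, in one-letter code.
RICE_49_64 = "IPKQWYNLVADLPVKP"
WHEAT_49_64 = "VPKQWYNLIADLPVKP"

print(percent_identity(RICE_49_64, WHEAT_49_64))  # 87.5
```

For a real comparison against the 80% threshold, the full-length polypeptides would first be aligned with a Clustal implementation and the resulting aligned rows passed to a function of this kind.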